zip-rs / zip-old

Zip implementation in Rust
MIT License
731 stars 204 forks source link

Can't copy files from the file being written to #364

Closed Pr0methean closed 1 year ago

Pr0methean commented 1 year ago

My app needs to duplicate some compressed files after the originals have been written to the Cursor<Vec<u8>> that eventually becomes the ZIP file, and the uncompressed originals have been dropped. The only way I've been able to find to accomplish that without a decompression round-trip is (where ZIP is a Mutex<ZipWriter<Cursor<Vec<u8>>>>, the function return type is Result<(),CloneableError> and anyhoo! is a macro similar to anyhow! but that produces a CloneableError):

    let mut zip = ZIP.lock().map_err(|error| anyhoo!(error.to_string()))?;
    let writer = zip.deref_mut();
    // Need to finish and consume the writer before switching to a reader. (For some reason, refs to
    // the underlying buffer don't reliably see updates, even when using a mutex so the read
    // happens-after the write.)
    let file_so_far = writer.finish().map_err(|error| anyhoo!(error))?;
    let mut reader = ZipArchive::new(Cursor::new(file_so_far.get_ref()))
        .map_err(|error| anyhoo!(error))?;
    let source_file = reader.by_name(&source_string).map_err(|error| anyhoo!(error))?;
    // To copy from within the same file, we need to borrow and mutably borrow the underlying Cursor
    // at the same time, hence the need for unsafe code.
    let source_file_copy: ZipFile = unsafe {
        transmute_copy(&source_file)
    };
    drop(source_file);
    unsafe {
        replace(writer, ZipWriter::new_append(file_so_far).map_err(|error| anyhoo!(error))?);
    }
    writer.raw_copy_file_rename(source_file_copy, &*dest_string).map_err(|error| anyhoo!(error))?;
    drop(zip);

This could probably be made much more efficient with a method dedicated to the same-zip-file case, which would shallow-copy the file entry so that the same copy of its compressed contents would be written twice.

Pr0methean commented 1 year ago

The workaround I used can be seen in its full context at https://github.com/Pr0methean/OcHd-RustBuild/blob/010f5049fc216532598e1cd56041a065e74b76b6/main/src/image_tasks/png_output.rs#L47

Pr0methean commented 1 year ago

I've created a fork at https://crates.io/crates/zip_next that adds this feature and uses only safe code. It makes a shallow copy (i.e. updates only the central directory).

Pr0methean commented 1 year ago

Also implemented a deep copy in 0.6.7, provided the underlying storage implements Read.

Plecra commented 1 year ago

Yeah, definitely don't use the unsafe code above :) that's entirely unsound. Awesome job making this API in zip_next! It's a nice optimization. I don't think I'm going bring the API into this crate as-is, but I'll be keeping this usecase in mind