sstadick / gzp

Multi-threaded Compression
The Unlicense

Seamless writing of uncompressed output #53

Closed by donovan-h-parks 1 year ago

donovan-h-parks commented 1 year ago

Hi,

Thank you for writing and maintaining gzp. I have found it extremely useful, but I have run into one limitation with the interface. In my current application, I would like to optionally generate uncompressed output, especially when the output is written to stdout. With the current gzp interface this is challenging, since finish() must be called on a ParCompress object before it goes out of scope. As such, the typical solution below doesn't work:

let mut seq_writer: Box<dyn Write + Send> = if compress {
    Box::new(
        ParCompressBuilder::new()
            .compression_level(flate2::Compression::new(compression_level as u32))
            .num_threads(processes)?
            .from_writer(seq_out),
    )
} else {
    Box::new(BufWriter::with_capacity(1024 * 1024, seq_out))
};

One can't call finish() in this situation, because the Box<dyn Write + Send> only exposes the methods of the Write trait. One solution would be to provide a ParCompress<NoCompress> type or something similar.

Is there another way to address this?

Thanks, Donovan

donovan-h-parks commented 1 year ago

A solution was proposed on Stack Overflow, which is wonderful, but I'll leave this open since a change to gzp might allow for a more elegant solution.

https://stackoverflow.com/questions/77137517/seamless-writing-of-uncompressed-output-when-using-the-rust-gzp-crate/77138543#77138543

sstadick commented 1 year ago

The Stack Overflow post was what I was going to suggest. I think abstracting this outside the library is appropriate in this case.
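For readers following along, here is a minimal sketch of what abstracting this outside the library could look like. It is not taken from the Stack Overflow answer or from gzp itself. The `Finish` trait, the `Output` enum, and the toy `TrailerWriter` are hypothetical names used only for illustration, and the sketch uses only the standard library so it can be run on its own. With gzp, the `Compressed` variant would presumably hold the `ParCompress` value and forward to its `finish()`, with its error converted as needed.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for a writer that must be finalized explicitly
/// (the role `finish()` plays for a compressor).
pub trait Finish: Write {
    fn finish(&mut self) -> io::Result<()>;
}

/// Output that is either compressed or plain, chosen at runtime.
pub enum Output<C: Finish, W: Write> {
    Compressed(C),
    Plain(BufWriter<W>),
}

impl<C: Finish, W: Write> Write for Output<C, W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match self {
            Output::Compressed(c) => c.write(buf),
            Output::Plain(w) => w.write(buf),
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match self {
            Output::Compressed(c) => c.flush(),
            Output::Plain(w) => w.flush(),
        }
    }
}

impl<C: Finish, W: Write> Output<C, W> {
    /// Finalize the stream and hand any error back to the caller.
    pub fn finish(&mut self) -> io::Result<()> {
        match self {
            Output::Compressed(c) => c.finish(),
            // Plain output has nothing to finalize beyond flushing the buffer.
            Output::Plain(w) => w.flush(),
        }
    }
}

/// Toy "compressor" for the demo: passes bytes through unchanged and
/// appends a trailer when finished.
pub struct TrailerWriter<W: Write> {
    inner: W,
}

impl<W: Write> Write for TrailerWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

impl<W: Write> Finish for TrailerWriter<W> {
    fn finish(&mut self) -> io::Result<()> {
        self.inner.write_all(b"<EOF>")?;
        self.inner.flush()
    }
}

/// Writes one record through either variant and returns the bytes produced.
pub fn demo(compress: bool) -> io::Result<Vec<u8>> {
    let mut sink = Vec::new();
    {
        let mut out = if compress {
            Output::Compressed(TrailerWriter { inner: &mut sink })
        } else {
            Output::Plain(BufWriter::new(&mut sink))
        };
        out.write_all(b"ACGT\n")?;
        out.finish()?; // errors propagate here instead of surfacing in Drop
    }
    Ok(sink)
}
```

The calling code writes through a single value and calls `finish()` on both paths, so errors from finalization can be handled with `?` like any other I/O error.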

If you wanted to use your original Box approach though, contrary to what the docs say, you don't have to call .finish(). If finish hasn't been called by the time the value is dropped, the Drop implementation will call finish for you. The downside is that you can't handle the errors: if something went wrong, it will just panic.
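To illustrate the pattern described here, the sketch below shows a toy writer whose Drop implementation calls finish when the caller has not. This is not gzp's actual implementation. `Finishing` and `write_boxed` are hypothetical names, and the "compression" is just a trailer appended on finish, so the example runs with the standard library alone.

```rust
use std::io::{self, BufWriter, Write};

/// Toy writer (not gzp's code) that appends a trailer when finished.
pub struct Finishing<W: Write> {
    inner: W,
    finished: bool,
}

impl<W: Write> Finishing<W> {
    pub fn new(inner: W) -> Self {
        Finishing { inner, finished: false }
    }

    /// Finalize the stream. Calling it more than once is a no-op.
    pub fn finish(&mut self) -> io::Result<()> {
        if !self.finished {
            self.finished = true;
            self.inner.write_all(b"<EOF>")?;
            self.inner.flush()?;
        }
        Ok(())
    }
}

impl<W: Write> Write for Finishing<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

impl<W: Write> Drop for Finishing<W> {
    fn drop(&mut self) {
        // Drop cannot return a Result, so a failure here can only panic.
        self.finish().expect("finish failed during drop");
    }
}

/// Uses the boxed trait-object approach and relies on Drop to finalize.
pub fn write_boxed(compress: bool) -> Vec<u8> {
    let mut sink = Vec::new();
    {
        let mut out: Box<dyn Write + '_> = if compress {
            Box::new(Finishing::new(&mut sink))
        } else {
            Box::new(BufWriter::new(&mut sink))
        };
        out.write_all(b"ACGT\n").unwrap();
    } // `out` is dropped here; Drop finalizes the stream
    sink
}
```

The trade-off is visible in the `drop` body: there is no way to return the error to the caller, which is why an explicit `finish()` is preferable whenever the concrete type is still accessible.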

donovan-h-parks commented 1 year ago

Thanks.