jongiddy commented 2 years ago

If the gzip header contains an optional filename or comment but they are not completely contained in the first buffer sent to a write::GzDecoder, then a valid header is created, missing data from these optional sections. A subsequent write call with the remaining header will treat the remaining header as encoded data and attempt to decode it, making the file appear to be corrupt.

This change rewrites the header parsing code to handle partial headers correctly for both Read (where WouldBlock is handled specially) and Write (where UnexpectedEof is handled specially).

The parsing code moves from bufread.rs to mod.rs to emphasize that it is also used by write.rs. write.rs no longer uses any symbols from bufread.rs.

The only test in bufread.rs tested Buffer which is no longer needed, so it was removed. The support code in the bufread tests module was used by a test in mod.rs which tests a read::GzDecoder so the support code and the test move to read.rs.

Fixes #320

jongiddy commented 2 years ago

After fixing the bug, it took a few iterations to get the size of GzDecoder back to its previous size. On 64-bit, write::GzDecoder<&mut [u8]> is now 216 bytes (previously 224 bytes). Moving the header into bufread::GzState makes bufread::GzDecoder<&[u8]> 176 bytes (previously 296 bytes).

jongiddy commented 1 year ago

344 #345 and #346 are simpler PR's performing some of the code changes in this PR. If they are merged, then this PR should be simpler to review.

jongiddy commented 1 year ago

Thanks for merging. It is a pleasant surprise when a well-aged PR finally gets merged. This patch has been running with no problems in our infrastructure for 9 months, so I'm reasonably confident that it is an improvement with no major downsides.

rust-lang / flate2-rs

Fix GzDecoder Write partial filenames and comments #323

344 #345 and #346 are simpler PR's performing some of the code changes in this PR. If they are merged, then this PR should be simpler to review.