rust-lang / flate2-rs

DEFLATE, gzip, and zlib bindings for Rust
https://docs.rs/flate2
Apache License 2.0

Continue reading a stream after ZlibDecoder stream finishes #401

Closed marxin closed 5 months ago

marxin commented 5 months ago

I'm implementing the parsing of the git pack file format as part of the coding challenge: https://app.codecrafters.io/courses/git/stages/7

It seems the git pack format is a binary file format where each object consists of a header followed by a zlib-compressed stream. The unpleasant part is that one doesn't know the size of the compressed block in advance. Is it possible to get back the underlying stream (with into_inner or get_mut), including the data already buffered by ZlibDecoder, so that I can carry on reading the next object header?
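To make the problem concrete, here is a minimal sketch of reading one object header. It assumes the layout described in git's pack-format documentation: the first byte carries a continuation bit, a 3-bit type, and the low 4 bits of the size, and each further byte adds 7 more size bits. The names `EntryHeader` and `read_entry_header` are hypothetical. The size stored in the header is the decompressed size, which is why the compressed length remains unknown.

```rust
use std::io::{self, Read};

/// Hypothetical container for the two fields of a pack entry header.
#[derive(Debug, PartialEq, Eq)]
struct EntryHeader {
    /// 3-bit object type (for example 1 = commit, 3 = blob).
    type_id: u8,
    /// Size of the object after decompression, not the compressed size.
    size: u64,
}

/// Reads the variable-length header that precedes each zlib stream.
fn read_entry_header<R: Read>(reader: &mut R) -> io::Result<EntryHeader> {
    let mut byte = [0u8; 1];
    reader.read_exact(&mut byte)?;

    // First byte: continuation bit, 3 type bits, low 4 size bits.
    let type_id = (byte[0] >> 4) & 0x07;
    let mut size = u64::from(byte[0] & 0x0f);
    let mut shift = 4u32;

    // Each following byte contributes 7 more size bits, least significant first.
    while byte[0] & 0x80 != 0 {
        if shift >= 64 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "pack entry size does not fit in 64 bits",
            ));
        }
        reader.read_exact(&mut byte)?;
        size |= u64::from(byte[0] & 0x7f) << shift;
        shift += 7;
    }

    Ok(EntryHeader { type_id, size })
}
```

After this function returns, the reader is positioned at the first byte of the compressed data, and nothing in the header says where that data ends.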

jongiddy commented 5 months ago

This should work with bufread::ZlibDecoder. See this test for bufread::GzDecoder. This code modified for zlib should work the same, allowing trailing data to be read from the BufRead after calling into_inner().

Note that the same test does not work for read::GzDecoder and similarly I do not expect it to work with read::ZlibDecoder.
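A rough illustration of why the two flavours can differ, using only the standard library. This is an analogy and not flate2's actual code: it assumes that a `read::` decoder keeps a private read-ahead buffer, much like `std::io::BufReader`, while a `bufread::` decoder consumes bytes from a buffer the caller owns. The function names are hypothetical.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

/// Reads `n` bytes through a private `BufReader`, then takes the inner
/// reader back and reports how many bytes it can still deliver.
fn remaining_after_private_buffer(data: &[u8], n: usize) -> usize {
    let mut buffered = BufReader::new(Cursor::new(data));
    let mut sink = vec![0u8; n];
    buffered.read_exact(&mut sink).unwrap();

    // `into_inner` discards whatever the BufReader had read ahead.
    let mut inner = buffered.into_inner();
    let mut rest = Vec::new();
    inner.read_to_end(&mut rest).unwrap();
    rest.len()
}

/// Consumes `n` bytes directly from a caller-owned `BufRead` and reports
/// how many bytes are still available afterwards.
fn remaining_after_shared_buffer(data: &[u8], n: usize) -> usize {
    let mut reader = Cursor::new(data);
    let available = reader.fill_buf().unwrap().len();

    // Only the bytes actually used are marked as consumed.
    reader.consume(n.min(available));
    let mut rest = Vec::new();
    reader.read_to_end(&mut rest).unwrap();
    rest.len()
}
```

With 16 bytes of input and `n = 4`, the first function finds nothing left in the inner reader because the private buffer swallowed all of it, while the second still sees the remaining 12 bytes.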

jongiddy commented 5 months ago

#402 adapts the gzip test to demonstrate that this also works for the deflate and zlib BufRead decoders.

marxin commented 5 months ago

Thank you very much for the fast response! It's great the current bufread::ZlibDecoder works as I needed. I can confirm it works for me in my particular test-case.

Have 2 comments:

Byron commented 5 months ago

As the original question was answered with tests, I think it's fair to close this issue. Everyone is still welcome to continue the conversation here.

Regarding documentation, please feel free to open a PR with the improvements to the docs that you would have wanted to see. Maybe you can also play around with ZlibDecoder and implement BufRead on it. Maybe even more improvements will arise from that :).

jongiddy commented 5 months ago

There is an existing discussion on why the bufread decoders do not implement BufRead.

jongiddy commented 5 months ago

The docs for bufread and write GzDecoder have text describing this behaviour. This can be copied to the docs for the other decoders.

marxin commented 5 months ago

The docs for bufread and write GzDecoder have text describing this behaviour.

Can you please send me a link to the behavior description? I can't find it :)

jongiddy commented 5 months ago

bufread: https://github.com/rust-lang/flate2-rs/blob/8a502a791fbcbdb56b20f6d6dcd7096f0c8f1a33/src/gz/bufread.rs#L171-L174

write: https://github.com/rust-lang/flate2-rs/blob/8a502a791fbcbdb56b20f6d6dcd7096f0c8f1a33/src/gz/write.rs#L174-L176

And there is an equivalent paragraph for the read decoder to say that this does not work: https://github.com/rust-lang/flate2-rs/blob/8a502a791fbcbdb56b20f6d6dcd7096f0c8f1a33/src/gz/read.rs#L97-L101