Open kallisti5 opened 1 year ago
I feel like the solution is that once ZlibDecoder reads the final 32k chunk, it should "back the source reader up to the end of the zlib stream"?
This would enable users to know where the last compressed stream ended within a file.
https://github.com/rust-lang/flate2-rs/blob/main/src/deflate/read.rs#L161 feels like the source of the guaranteed 32k read from the files.
I think this is the same problem as #367, except that that issue is for gzip. Essentially, this is expected behavior for the `read` interfaces, which actually wrap the `Read` type in a new `std::io::BufReader` for each decoder.
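To illustrate the read-ahead effect described here without involving flate2 at all, the following is a minimal standard-library sketch. The helper name `source_position_after_read` and the small buffer capacity are assumptions chosen for illustration; flate2's internal buffer size is not modelled.

```rust
use std::io::{BufReader, Cursor, Read};

/// Hypothetical helper: read `want` bytes through a fresh `BufReader` of the
/// given capacity and report how far the underlying source advanced.
fn source_position_after_read(data: Vec<u8>, want: usize, capacity: usize) -> u64 {
    let mut source = Cursor::new(data);
    {
        // The BufReader fills its whole buffer from the source, even though
        // the caller only asks for `want` bytes.
        let mut buffered = BufReader::with_capacity(capacity, &mut source);
        let mut out = vec![0u8; want];
        buffered
            .read_exact(&mut out)
            .expect("source holds at least `want` bytes");
    } // `buffered` is dropped here, and its unread bytes are discarded.
    source.position()
}
```

Calling `source_position_after_read(vec![0u8; 100], 10, 32)` reports a position of 32 rather than 10: the source has moved by one full buffer, which mirrors the file pointer landing on a buffer boundary instead of at the end of the compressed stream.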
To fix, wrap the `File` in a `BufReader` once; you can then pass it to multiple `bufread::ZlibDecoder` instances.
This will, however, also make `stream_position` incorrect, even if you can access it, so you will need a different way to know when to terminate the loop.
The flate2 ZlibDecoder seems to read too much data and advance file pointers too far.
Parsing a raw file filled with individual hunks of zlib compressed data:
dd if=sample/ctags_source-5.8-5-source.hpkg bs=1 skip=80 count=22137 of=test1
file test1 ; test1: zlib compressed data
dd if=sample/ctags_source-5.8-5-source.hpkg bs=1 skip=22217 count=22097 of=test2
file test2 ; test2: zlib compressed data
I can validate these chunks:
cat test1 | zlib-flate -uncompress > test1.uncompressed
cat test2 | zlib-flate -uncompress > test2.uncompressed
However, when I try to decode these chunks with flate2's ZlibDecoder:
read_exact produces the expected 64k of uncompressed data, but the file pointer has advanced 32k into the file instead of the expected 22057 bytes. (22057 marks the "end" of the compressed data stream and the start of the next one, which begins with 0x78, 0xDA.)