FvK91 closed this issue 3 weeks ago
This is as expected for a gzip file with multiple members, which that one is. Here is the structure of that file, as shown by pigz -ltv:
method    check    timestamp    compressed   original reduced  name
gzip 8    d293a4bf ------ -----        427        960   55.5%  dummy
gzip 8    728ddacc ------ -----        212       5876   96.4%  <...>
gzip 8    4f4ecb61 ------ -----        299      15712   98.1%  <...>
          2ea6e3a6                     938      22548   95.8%  (total)
There are three members. You simply need to keep decompressing with a new instance of zlib.decompressobj(), or in C, using inflateReset(), for each member. From the documentation in zlib.h (always a good idea to read the documentation):
Unlike the gunzip utility and gzread() (see below), inflate() will not automatically decode concatenated gzip members. inflate() will return Z_STREAM_END at the end of the gzip member. The state would need to be reset to continue decoding a subsequent gzip member. This must be done if there is more data after a gzip member, in order for the decompression to be compliant with the gzip standard (RFC 1952).
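For illustration, here is a minimal Python sketch of that loop. It assumes the whole file fits in memory and has no trailing padding after the last member. The helper name decompress_all_members and the three-member sample stream are made up for this example; they are not taken from the attached scripts.

```python
import gzip
import zlib


def decompress_all_members(data: bytes) -> bytes:
    """Decompress every gzip member in `data` and concatenate the results.

    Hypothetical helper for illustration only.
    """
    out = []
    while data:
        # 31 = 16 + 15: expect a gzip header and trailer, 32 KiB window.
        d = zlib.decompressobj(31)
        out.append(d.decompress(data))
        out.append(d.flush())
        if not d.eof:
            raise ValueError("truncated gzip member")
        # Bytes past the end of this member belong to the next one.
        data = d.unused_data
    return b"".join(out)


# Build a small three-member stream by concatenating independent gzip files.
stream = b"".join(gzip.compress(s) for s in (b"one\n", b"two\n", b"three\n"))

# A single decompressobj stops at the end of the first member.
first_only = zlib.decompressobj(31).decompress(stream)

# A fresh decompressobj per member recovers all of the data.
everything = decompress_all_members(stream)
```

Here first_only holds only the first member's data, which matches the behaviour reported in this issue, while everything holds the output of all three members. For large files, the same idea applies chunk by chunk: feed unused_data to a fresh decompressobj whenever eof becomes true.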
Thanks Mark for the clear explanation. Much appreciated! Good to know I can use pigz to analyze gzip files in the future.
Hi, I'm having a problem when decompressing a file using zlib (v1.3.1): only the first line of the gzip archive is decompressed. After decompressing that one line, inflate() returns Z_STREAM_END immediately.
Decompressing the archive using a tool like 7-zip works just fine.
The problem also occurs when using the zlib module in Python: only a single line is decompressed. With Python's gzip module, everything works fine.
I have added 2 scripts and a dummy.gz file to reproduce the problem. problem_Z_STREAM_END.zip
Since I am able to successfully decompress the file with other tools/libraries I wonder if this is a bug in zlib or if it is expected behavior.