Decompress consumes an extra byte when extra data are provided

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?

1. Provide extra data in the input buffer for tinfl_decompress().
2. An extra byte is consumed beyond the deflated data.
3. The output parameter pIn_buf_size of tinfl_decompress() has an extra byte.

This is a problem for gzip stream data since it has 8 extra bytes after the 
deflated data (4 bytes CRC and 4 bytes original length).  When reading gzip 
stream data, it's often not known when the end of stream is.  One has to read 
as much as data there are and pass them to tinfl_decompress() to let it decode 
the deflated data.  Once tinfl_decompress() stops, the remaining data in the 
buffer are interpreted separately by the caller.

What is the expected output? What do you see instead?

Expect decompress() to consume exactly the number of deflated data, or if an 
extra byte is consumed, let the caller knows so that it can unread the byte in 
the buffer.

What version of the product are you using? On what operating system?
version 1.15.  On Win32.

Please provide any additional information below.

Sample gzip deflated data causing the problem.

[0x73, 0x74, 0x72, 0x76, 0x71, 0x75, 0x73, 0xF7, 0xE0, 0xE5, 0x02, 0x00,
  0x94, 0xA6, 0xD7, 0xD0,  0x0A, 0x00, 0x00, 0x00]

The first 12 bytes are the deflated data.  The next 4 bytes are the CRC.  The 
last 4 bytes are the original length.  tinfl_decompress() should stop after 12 
bytes.

The original data is "ABCDEFGH\r\n"

--------

I found the problem where it's too late to call TINFL_HUFF_BITBUF_FILL().  It 
is only called when only 1 byte are left in the input buffer, which is never 
triggered for the above case where there are 8 extra bytes.

I have changed the number of bytes to check from ((pIn_buf_end - pIn_buf_cur) < 
2) to ((pIn_buf_end - pIn_buf_cur) < 16) and that works.  See attached file for 
the diffs.

It's not an airtight fix since it only helps the cases with input buffer upto 
15 extra bytes.  It makes miniz work on gzip files.

The best fix would be to consume the extra byte as needed but fix up the 
pointer by counting how many bits left in bit_buf.  Since the data are in a 
buffer, it's easy to move the pointer back as needed.

Original issue reported on code.google.com by williamw...@gmail.com on 18 Oct 2013 at 10:01

Attachments:

miniz_fix.txt

GoogleCodeExporter commented 9 years ago

Very good find, thanks a lot! This is a very high priority issue - let me see 
what I can do. I've been working off/on miniz over the past week so I'm going 
to dive in later today. I've looked at the code and I see a solid solution that 
involves refactoring this macro, but I need to profile it first. The solution 
to fix the readahead problem is already there in the 
TINFL_HUFF_BITBUF_FILL/TINFL_HUFF_DECODE macros, but I need to rethink how it's 
optimized. Worst case, I need to eliminate the 2 byte readahead optimization 
and just always punt to TINF_HUFF_BITBUF_FILL. Or, always read X bytes from the 
input buf when possible, do the Huff decode, then actually update the 
pointer/bits left/etc. Any issues in the core Deflate/Inflate code itself make 
me drop everything. Crap.

The existing code should work OK for regular zlib-style (followed by byte 
aligned adler32) or raw deflate streams if the user doesn't follow the stream 
with any extra data they care about (gzip obviously does) and need to read 
immediately following the compressed data (and they don't have any hints about 
the compressed data's exact size). I tried to solve this problem but overlooked 
this case. I'll also add it to my testing.

Original comment by richge...@gmail.com on 18 Oct 2013 at 1:50

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Another solution is to have the inflator put back the extra prefetched/unneeded 
byte (decrementing the pIn_buf_cur ptr) after it finishes decoding (but before 
the adler32 check). I'm always hesitant about "put back" style solutions 
because I need to ensure it's always possible to put the byte back (without 
doing a coroutine return). Also, telling the caller the inflator read 1 byte 
too much seems so unelegant.

Original comment by richge...@gmail.com on 18 Oct 2013 at 2:09

GoogleCodeExporter commented 9 years ago

Thanks for looking into it!  Good to know the solutions are readily available.

I hate to see 2-byte (4-byte at 64-bit) fast read optimization you put in goes 
away, especially for the 4-byte batch read, but if profiling tells a different 
story, it's up to you.

Since Miniz API always deals with buffer rather than reading from stream 
directly, it's easy to move the read pointer back and forth to unread the 
readahead bytes.  I think num_bits already tracks how many bits not used in the 
bit_buf.  At the end of decoding, it can be checked to see whether it has 8 
bits (16 bits or 32 bits for 64-bit case) and move the read pointer back by the 
corresponding number of bytes.

Thanks again for Miniz.  It's a great project.

Original comment by williamw...@gmail.com on 18 Oct 2013 at 4:33

GoogleCodeExporter commented 9 years ago

FYI if you are interested to know.  I'm incorporating Miniz into the Rust 
runtime library.  It will be used as the basis for other compression libraries 
that need the DEFLATE algorithm.  Others like GZip or Zip will be using it.

The project https://github.com/williamw520/rustyzip is still at early stage.  
Take a look at deflate.rs if you are interested in seeing how Miniz is being 
used.

By the way while you are updating the Miniz project, can you add API for alloc 
and free the tdefl_compressor structure and the tinfl_decompressor structure?  
For non-C languages interfacing with Miniz, they probably don't know the size 
of the structures to allocate in a platform independent way.  For Rust, I have 
to modify Miniz to add tdefl_alloc_compressor, tdefl_free_compressor, 
tinfl_alloc_decompressor, and tinfl_free_decompressor to make it work.  See the 
attached file.  Thanks!

Original comment by williamw...@gmail.com on 18 Oct 2013 at 5:14

Attachments:

miniz_addition.c

GoogleCodeExporter commented 9 years ago

I checked out Rust on Wikipedia and it sounds awesome. Thanks for choosing 
miniz and I'll do whatever I can to directly support you. We use miniz's core 
compression code on lots of important stuff internally so any issues here must 
be fixed.

I'm working on fixing the readahead problem right now. As these issues tend to 
work out, the 2 byte lookahead optimization throws a larger monkeywrench into 
the problem that I expected. I believe your workaround should be safe on all 
zip/gzip streams though, because you know how many bytes will be expected to 
follow the Deflate stream so the 2 byte readahead disable optimization (where 
it switches into super conservative read 1 byte at a time mode) will always be 
disabled near the end of the stream. But the readahead opt won't get disabled 
if large amounts of bytes follow the end of the stream. I want to solve this in 
a universal way/provably correct/high performance/conservative way (lots of 
constraints).

So I can't just push back up to 1 byte - sometimes the Huff lookahead code will 
see it only has 14 bits available, so it'll read 16 more bits, so it'll have 30 
bits total. If the EOF sym is only 1 bit (rare but possible) we'll have read 
ahead by 29 bits - ugh. I believe putting back 1 or 2 bytes is provably always 
possible, but any more than this makes me nervous.

Historically, most other decompressors I've written didn't care about being 
super conservative about not reading beyond the end because the "end" was 
*always* the last byte of compressed data (and we stored the exact length of 
the compressed stream somewhere), and perf. was always a higher priority than 
pure streaming.

Original comment by richge...@gmail.com on 19 Oct 2013 at 10:44

GoogleCodeExporter commented 9 years ago

Oops, I meant to say that by you switching the decompressor into conservative 
read-ahead mode when there's < 16 bytes (not < 2) should fix the gzip problem 
OK (and can't break zlib streams). But as you pointed out, if more than this 
amount follows the compressed data there are still readhead problems.

Original comment by richge...@gmail.com on 19 Oct 2013 at 11:06

GoogleCodeExporter commented 9 years ago

OK I believe I've found a proper solution that should always work that doesn't 
require getting rid of the lookahead optimization. They key thing is to undo 
the lookahead at both places the inflator can return to the caller: either 
during a normal return (when the input or output buffer is exhausted), or after 
the last block/before the optional adler32 check. This prevents the tricky case 
where the decompressor read ahead too much (because it thought it was safe to 
do because there was still plenty of bytes remaining in the inbuf), did a 
coroutine return to flush the output buffer, and the caller sets the 
pIn_buf_next==pIn_buf_cur ptrs to equal on the final call (making it impossible 
to "push back" any readahead bytes before the final return).

I've added a randomized test to miniz_tester.cpp to help ensure each case gets 
coverage (zlib or raw deflate+random amounts of user data from 0-X bytes).

Original comment by richge...@gmail.com on 19 Oct 2013 at 1:29

GoogleCodeExporter commented 9 years ago

Ok, I'm testing the real fixes right now. The merge is very tiny, but the 
Inflator is complex and super dense so I need to do a bunch of automated 
testing before releasing a beta. (And it'll be v1.16 beta because I am 
terrified of breaking anything in this code..)

The workaround you are currently using is worrying me more. I'm wondering if 
you've tested 64-bit - the optimized literal loop can load up to 32-bits at a 
time here. There's a check in the literal loop which determines if it can be 
used:

if (((pIn_buf_end - pIn_buf_cur) < 4) || ((pOut_buf_end - pOut_buf_cur) < 2))

For max safety I would change this to be ((pIn_buf_end - pIn_buf_cur) < 16) in 
your case until I the new version is checked in (tonight or tomorrow I hope - 
if testing goes well).

-Rich

Original comment by richge...@gmail.com on 19 Oct 2013 at 3:34

GoogleCodeExporter commented 9 years ago

Here's an alpha release of v1.16 beta with the fixes if you would like to try 
them: http://www.tenacioussoftware.com/miniz.c

Note I'm still testing these fixes so a major flaw could be present, etc. 
Please let me know how it works out.

I'll need to update tinfl.c too once it's ready.

Original comment by richge...@gmail.com on 19 Oct 2013 at 3:54

GoogleCodeExporter commented 9 years ago

Testing so far seems good..

Original comment by richge...@gmail.com on 19 Oct 2013 at 7:45

GoogleCodeExporter commented 9 years ago

Fantastic!  The fix is universal to handle all cases.  That's great.

The workaround was temporary and meant to confirm the theory on cause of the 
problem in the fast path.  I'll take the new code once it's released.

Also can you merge in the allocator/free changes?  Those are really helpful for 
non-C languages to use Miniz.  Actually let me file a separate merge request 
for that.

Thanks again.

Original comment by williamw...@gmail.com on 19 Oct 2013 at 10:25

GoogleCodeExporter commented 9 years ago

I've updated the version I uploaded earlier today:

http://www.tenacioussoftware.com/miniz.c
This is very close to what I'm going to release as a beta. I want to get your 
alloc additions in too but this fix seems much more important to you. I'll be 
testing this code more over the next week at home and work, but I'm feeling 
pretty good about it. The changes are pretty conservative, the diff is tiny, 
and they don't sacrifice any perf.

Original comment by richge...@gmail.com on 19 Oct 2013 at 10:36

GoogleCodeExporter commented 9 years ago

Ok, I'm adding one more change that will make the inflator much more resilient 
in the face of corrupted or truncated input and when the caller has no idea how 
many output bytes to expect. If the caller knows the max # of output bytes to 
expect, it can eventually stop with an error if it receives too many output 
bytes. This is always how I've used the inflator (and is the safest way to use 
it), and the inflator is already pretty resilient against corrupted input (and 
has been fuzz tested) but it can't hurt.

Right now the existing inflator just padds the end of the input with endless 0 
bytes and hopes the caller will eventually get tired of possibly receiving 
large amounts of uncompressed data (worst case scenario). It should have 
probably padded with 0xFF's because the EOF Huff code is the largest and is 
usually (always?) all 1's, but I'm removing this ugliness entirely. 

TINFL_STATUS_FAILED_CANNOT_MAKE_PROGRESS (-4) will be returned in v1.16 if the 
inflator attempts to read beyond the end of the input *and* the caller has 
indicated there is no more input. It will still return the usual 
TINFL_STATUS_NEEDS_MORE_INPUT if you indicate that more input is available. 

You can try calling it again if TINFL_STATUS_FAILED_CANNOT_MAKE_PROGRESS is 
returned, but if you receive this code it's likely either the compressed data 
is invalid or the caller somehow screwed something up. (Or you are doing 
something fancy with streaming and don't even know if there's more input yet, 
which should be a supportable scenario now.)

I'm going to put these changes through much more testing and re-upload soon.

Original comment by richge...@gmail.com on 19 Oct 2013 at 11:38

GoogleCodeExporter commented 9 years ago

Yes, I'll merge in your allocator changes too.

BTW - With these 2 fixes the inflator's should be less CPU code than before.

Original comment by richge...@gmail.com on 19 Oct 2013 at 11:41

GoogleCodeExporter commented 9 years ago

Thank you!  This is great.  With the correct byte consumption at decompress, 
Miniz will be perfect for on-the-fly stream compression, like compressing and 
decompressing network data while writing or reading them on the wire.  This 
would work on HTTP response streaming compression.  A big thank you.

Original comment by williamw...@gmail.com on 20 Oct 2013 at 12:45

GoogleCodeExporter commented 9 years ago

These fixes are in v1.16 beta:
http://www.tenacioussoftware.com/miniz_v116_beta_r1.7z

Also updated the main page. I'll make this a non-beta release after more 
file/fuzz testing, but I think they are good.

Original comment by richge...@gmail.com on 20 Oct 2013 at 6:01

GoogleCodeExporter commented 9 years ago

Original comment by richge...@gmail.com on 20 Oct 2013 at 6:01

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

Just letting you know that I've tested 1.16 beta on over 2 million files (with 
random settings and random flushes) so far - looks OK. I'm going to merge it 
over to my zip64 branch tomorrow and continue pounding on it.

I'm torn about zip64 - the amount of extra complexity is making me sad. It'll 
have to be split apart from the main part of miniz.c into a new file 
(miniz_zip.c?) - it's just too much code. But it's nice to be able to make 
enormous archives of any size file and not worry.

Original comment by richge...@gmail.com on 21 Oct 2013 at 4:38

uuangian / miniz

Decompress consumes an extra byte when extra data are provided #23