Closed GoogleCodeExporter closed 9 years ago
Very good find, thanks a lot! This is a very high priority issue - let me see
what I can do. I've been working off/on miniz over the past week so I'm going
to dive in later today. I've looked at the code and I see a solid solution that
involves refactoring this macro, but I need to profile it first. The solution
to fix the readahead problem is already there in the
TINFL_HUFF_BITBUF_FILL/TINFL_HUFF_DECODE macros, but I need to rethink how it's
optimized. Worst case, I need to eliminate the 2 byte readahead optimization
and just always punt to TINF_HUFF_BITBUF_FILL. Or, always read X bytes from the
input buf when possible, do the Huff decode, then actually update the
pointer/bits left/etc. Any issues in the core Deflate/Inflate code itself make
me drop everything. Crap.
The existing code should work OK for regular zlib-style (followed by byte
aligned adler32) or raw deflate streams if the user doesn't follow the stream
with any extra data they care about (gzip obviously does) and need to read
immediately following the compressed data (and they don't have any hints about
the compressed data's exact size). I tried to solve this problem but overlooked
this case. I'll also add it to my testing.
Original comment by richge...@gmail.com
on 18 Oct 2013 at 1:50
Another solution is to have the inflator put back the extra prefetched/unneeded
byte (decrementing the pIn_buf_cur ptr) after it finishes decoding (but before
the adler32 check). I'm always hesitant about "put back" style solutions
because I need to ensure it's always possible to put the byte back (without
doing a coroutine return). Also, telling the caller the inflator read 1 byte
too much seems so unelegant.
Original comment by richge...@gmail.com
on 18 Oct 2013 at 2:09
Thanks for looking into it! Good to know the solutions are readily available.
I hate to see 2-byte (4-byte at 64-bit) fast read optimization you put in goes
away, especially for the 4-byte batch read, but if profiling tells a different
story, it's up to you.
Since Miniz API always deals with buffer rather than reading from stream
directly, it's easy to move the read pointer back and forth to unread the
readahead bytes. I think num_bits already tracks how many bits not used in the
bit_buf. At the end of decoding, it can be checked to see whether it has 8
bits (16 bits or 32 bits for 64-bit case) and move the read pointer back by the
corresponding number of bytes.
Thanks again for Miniz. It's a great project.
Original comment by williamw...@gmail.com
on 18 Oct 2013 at 4:33
FYI if you are interested to know. I'm incorporating Miniz into the Rust
runtime library. It will be used as the basis for other compression libraries
that need the DEFLATE algorithm. Others like GZip or Zip will be using it.
The project https://github.com/williamw520/rustyzip is still at early stage.
Take a look at deflate.rs if you are interested in seeing how Miniz is being
used.
By the way while you are updating the Miniz project, can you add API for alloc
and free the tdefl_compressor structure and the tinfl_decompressor structure?
For non-C languages interfacing with Miniz, they probably don't know the size
of the structures to allocate in a platform independent way. For Rust, I have
to modify Miniz to add tdefl_alloc_compressor, tdefl_free_compressor,
tinfl_alloc_decompressor, and tinfl_free_decompressor to make it work. See the
attached file. Thanks!
Original comment by williamw...@gmail.com
on 18 Oct 2013 at 5:14
Attachments:
I checked out Rust on Wikipedia and it sounds awesome. Thanks for choosing
miniz and I'll do whatever I can to directly support you. We use miniz's core
compression code on lots of important stuff internally so any issues here must
be fixed.
I'm working on fixing the readahead problem right now. As these issues tend to
work out, the 2 byte lookahead optimization throws a larger monkeywrench into
the problem that I expected. I believe your workaround should be safe on all
zip/gzip streams though, because you know how many bytes will be expected to
follow the Deflate stream so the 2 byte readahead disable optimization (where
it switches into super conservative read 1 byte at a time mode) will always be
disabled near the end of the stream. But the readahead opt won't get disabled
if large amounts of bytes follow the end of the stream. I want to solve this in
a universal way/provably correct/high performance/conservative way (lots of
constraints).
So I can't just push back up to 1 byte - sometimes the Huff lookahead code will
see it only has 14 bits available, so it'll read 16 more bits, so it'll have 30
bits total. If the EOF sym is only 1 bit (rare but possible) we'll have read
ahead by 29 bits - ugh. I believe putting back 1 or 2 bytes is provably always
possible, but any more than this makes me nervous.
Historically, most other decompressors I've written didn't care about being
super conservative about not reading beyond the end because the "end" was
*always* the last byte of compressed data (and we stored the exact length of
the compressed stream somewhere), and perf. was always a higher priority than
pure streaming.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 10:44
Oops, I meant to say that by you switching the decompressor into conservative
read-ahead mode when there's < 16 bytes (not < 2) should fix the gzip problem
OK (and can't break zlib streams). But as you pointed out, if more than this
amount follows the compressed data there are still readhead problems.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 11:06
OK I believe I've found a proper solution that should always work that doesn't
require getting rid of the lookahead optimization. They key thing is to undo
the lookahead at both places the inflator can return to the caller: either
during a normal return (when the input or output buffer is exhausted), or after
the last block/before the optional adler32 check. This prevents the tricky case
where the decompressor read ahead too much (because it thought it was safe to
do because there was still plenty of bytes remaining in the inbuf), did a
coroutine return to flush the output buffer, and the caller sets the
pIn_buf_next==pIn_buf_cur ptrs to equal on the final call (making it impossible
to "push back" any readahead bytes before the final return).
I've added a randomized test to miniz_tester.cpp to help ensure each case gets
coverage (zlib or raw deflate+random amounts of user data from 0-X bytes).
Original comment by richge...@gmail.com
on 19 Oct 2013 at 1:29
Ok, I'm testing the real fixes right now. The merge is very tiny, but the
Inflator is complex and super dense so I need to do a bunch of automated
testing before releasing a beta. (And it'll be v1.16 beta because I am
terrified of breaking anything in this code..)
The workaround you are currently using is worrying me more. I'm wondering if
you've tested 64-bit - the optimized literal loop can load up to 32-bits at a
time here. There's a check in the literal loop which determines if it can be
used:
if (((pIn_buf_end - pIn_buf_cur) < 4) || ((pOut_buf_end - pOut_buf_cur) < 2))
For max safety I would change this to be ((pIn_buf_end - pIn_buf_cur) < 16) in
your case until I the new version is checked in (tonight or tomorrow I hope -
if testing goes well).
-Rich
Original comment by richge...@gmail.com
on 19 Oct 2013 at 3:34
Here's an alpha release of v1.16 beta with the fixes if you would like to try
them: http://www.tenacioussoftware.com/miniz.c
Note I'm still testing these fixes so a major flaw could be present, etc.
Please let me know how it works out.
I'll need to update tinfl.c too once it's ready.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 3:54
Testing so far seems good..
Original comment by richge...@gmail.com
on 19 Oct 2013 at 7:45
Fantastic! The fix is universal to handle all cases. That's great.
The workaround was temporary and meant to confirm the theory on cause of the
problem in the fast path. I'll take the new code once it's released.
Also can you merge in the allocator/free changes? Those are really helpful for
non-C languages to use Miniz. Actually let me file a separate merge request
for that.
Thanks again.
Original comment by williamw...@gmail.com
on 19 Oct 2013 at 10:25
I've updated the version I uploaded earlier today:
http://www.tenacioussoftware.com/miniz.c
This is very close to what I'm going to release as a beta. I want to get your
alloc additions in too but this fix seems much more important to you. I'll be
testing this code more over the next week at home and work, but I'm feeling
pretty good about it. The changes are pretty conservative, the diff is tiny,
and they don't sacrifice any perf.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 10:36
Ok, I'm adding one more change that will make the inflator much more resilient
in the face of corrupted or truncated input and when the caller has no idea how
many output bytes to expect. If the caller knows the max # of output bytes to
expect, it can eventually stop with an error if it receives too many output
bytes. This is always how I've used the inflator (and is the safest way to use
it), and the inflator is already pretty resilient against corrupted input (and
has been fuzz tested) but it can't hurt.
Right now the existing inflator just padds the end of the input with endless 0
bytes and hopes the caller will eventually get tired of possibly receiving
large amounts of uncompressed data (worst case scenario). It should have
probably padded with 0xFF's because the EOF Huff code is the largest and is
usually (always?) all 1's, but I'm removing this ugliness entirely.
TINFL_STATUS_FAILED_CANNOT_MAKE_PROGRESS (-4) will be returned in v1.16 if the
inflator attempts to read beyond the end of the input *and* the caller has
indicated there is no more input. It will still return the usual
TINFL_STATUS_NEEDS_MORE_INPUT if you indicate that more input is available.
You can try calling it again if TINFL_STATUS_FAILED_CANNOT_MAKE_PROGRESS is
returned, but if you receive this code it's likely either the compressed data
is invalid or the caller somehow screwed something up. (Or you are doing
something fancy with streaming and don't even know if there's more input yet,
which should be a supportable scenario now.)
I'm going to put these changes through much more testing and re-upload soon.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 11:38
Yes, I'll merge in your allocator changes too.
BTW - With these 2 fixes the inflator's should be less CPU code than before.
Original comment by richge...@gmail.com
on 19 Oct 2013 at 11:41
Thank you! This is great. With the correct byte consumption at decompress,
Miniz will be perfect for on-the-fly stream compression, like compressing and
decompressing network data while writing or reading them on the wire. This
would work on HTTP response streaming compression. A big thank you.
Original comment by williamw...@gmail.com
on 20 Oct 2013 at 12:45
These fixes are in v1.16 beta:
http://www.tenacioussoftware.com/miniz_v116_beta_r1.7z
Also updated the main page. I'll make this a non-beta release after more
file/fuzz testing, but I think they are good.
Original comment by richge...@gmail.com
on 20 Oct 2013 at 6:01
Original comment by richge...@gmail.com
on 20 Oct 2013 at 6:01
Just letting you know that I've tested 1.16 beta on over 2 million files (with
random settings and random flushes) so far - looks OK. I'm going to merge it
over to my zip64 branch tomorrow and continue pounding on it.
I'm torn about zip64 - the amount of extra complexity is making me sad. It'll
have to be split apart from the main part of miniz.c into a new file
(miniz_zip.c?) - it's just too much code. But it's nice to be able to make
enormous archives of any size file and not worry.
Original comment by richge...@gmail.com
on 21 Oct 2013 at 4:38
Original issue reported on code.google.com by
williamw...@gmail.com
on 18 Oct 2013 at 10:01Attachments: