zopfli-rs / zopfli

A Rust implementation of the Zopfli compression algorithm.
Apache License 2.0

Possible regression and possible fix #38

Open andrews05 opened 4 months ago

andrews05 commented 4 months ago

This is in reference to https://github.com/shssoichiro/oxipng/issues/579 where a regression was reported with zopfli v0.8.0 (in conjunction with Rust 1.74) when compressing some images. I've found this can be resolved by adding a single line in lib.rs:

        Format::Zlib => {
            let mut zlib_encoder = ZlibEncoder::new_buffered(options, BlockType::Dynamic, out)?;
+           let mut in_data = std::io::BufReader::new(in_data);
            std::io::copy(&mut in_data, &mut zlib_encoder)?;
            zlib_encoder.into_inner()?.finish().map(|_| ())
        }

Why this changes anything, I have no idea. It's possible it's not so much a regression as just noise, but I haven't tested enough files to get an idea of the average case, so I'm not sure whether the above is actually an appropriate change to make.

For a faster test case than the one originally reported, try using issue-29.png in the oxipng test files.

The "bad" output from oxipng 9:

$ oxipng --zopfli --pretend tests/files/issue-29.png 
Processing: tests/files/issue-29.png
214347 bytes (5.52% smaller): Running in pretend mode, no output

The "good" output from oxipng 8 or with the above change applied:

$ oxipng --zopfli --pretend tests/files/issue-29.png 
Processing: tests/files/issue-29.png
214229 bytes (5.57% smaller): Running in pretend mode, no output
andrews05 commented 4 months ago

Thinking about this further, I realise wrapping the input should probably be the caller's responsibility, where appropriate, and not zopfli's at all.

Also, I found the BufReader incurs a performance penalty. Using Box instead gives the same output without the speed loss.

kornelski commented 4 months ago

I suspect it's a bug that buffering affects the output at all. I don't recall anything in the underlying C API that requires feeding the library larger chunks of data – block boundaries and flushing are handled independently of the input buffer size.

andrews05 commented 4 months ago

I don't think it's actually about buffering. I just happened to try a BufReader first, but wrapping the &[u8] in any Read implementer seems to resolve the issue: Box, Cursor, or a custom Read struct that does nothing but wrap the &[u8] (the HashingAndCountingRead served this purpose prior to v0.8). It doesn't seem to affect the zopfli CLI tool, where the reader is a File. Perhaps the Read implementation for &[u8] changed in Rust 1.74?

[edit] I've been doing some benchmarks and found a clear trend of smaller files with the change, despite some that increased in size by a tiny amount. So I don't think it's just noise.

AlexTMjugador commented 4 months ago

Coincidentally, std::io::copy implementation changed in Rust 1.73 in ways that might affect how uncompressed data chunks are read from the source for writing to the compressor: https://github.com/rust-lang/rust/commit/6f8ba511fad774d5d55456c7a60d8bf27a194053

From what I've seen, Zopfli can be sensitive to how the input data is chunked, so that would explain the size differences. I don't expect such a behavior change to be a deal-breaker for using Zopfli, as any regressions or improvements should be minor, at least for the most part.

I'd like to trace calls to the C equivalent of the deflate_part function to confirm whether that optimization is indeed the root cause, whether I made a logic error somewhere in the implementation, and how we can improve from here.

andrews05 commented 4 months ago

Ah, the copy change sounds like a likely suspect.

For reference, the greatest difference I've observed so far was 0.8% on a ~350k (compressed size) png.