Cap buffer sizes via `ZlibStream::set_max_total_output`.

anforowicz commented 10 months ago

PTAL? This results in quite a significant performance improvement for the noncompressed-8x8.png benchmark (bigger improvement than https://github.com/image-rs/image-png/pull/427 and https://github.com/image-rs/image-png/pull/428 combined - I am surprised).

anforowicz commented 10 months ago

Hmmm... I guess I need to look into the fuzzing failure. I suspect that IHDR can declare a smaller image size than the real image - I'll try to author a regression test that would do that.

anforowicz commented 10 months ago

Hmmm... I guess I need to look into the fuzzing failure. I suspect that IHDR can declare a smaller image size than the real image - I'll try to author a regression test that would do that.

This is done now. I am not sure if I need to do something special to kick-off the CI tests again?

anforowicz commented 10 months ago

@fintelia, can you PTAL?

fintelia commented 9 months ago

Haven't forgotten about this PR. But I've been wanting to finish https://github.com/image-rs/fdeflate/pull/14 first so that we don't have to reserve extra space just to make fdeflate happy

Update: fdeflate:0.3.2 is now published. It removes the requirements for extra buffer space in the output buffer

fintelia commented 9 months ago

Left a small comment, but otherwise looks good to me. As a followup I plan to experiment with changing ZlibStream to directly write into a caller provided buffer instead of using its own out_buffer.

anforowicz commented 9 months ago

Left a small comment, but otherwise looks good to me.

Ack. Thanks for reviewing!

As a followup I plan to experiment with changing ZlibStream to directly write into a caller provided buffer instead of using its own out_buffer.

Thanks for looking into this. Let me try to share a few notes, hoping that it helps (i.e. I am mostly trying to share knowledge and past experience, and definitely not trying to influence the changes one way or another - I don't claim to know which way is "right").

First of all - I note that there are interesting interdependencies between the data:

Final image doesn't have the "filter type" byte at the front of each row, but the raw decompressed data includes such byte.
Decompressor needs to retain 32kB for looking back.

My early attempts at a similar change can be found in https://github.com/anforowicz/image-png/commit/6071020bf83857ec27ef355cc74d23bd38030c43. (Although this is slightly different than what you are considering - here I am avoiding passing &mut Vec<u8> into ZlibStream's decompress and finish_compressed_chunks methods, but I still keep ZlibStream::out_buffer.) I've abandoned this attempt because:

I was using this as a prerequisite for an avoiding extra copies into prev row, but this copying has been taken care of with a different, simpler approach - https://github.com/image-rs/image-png/pull/422 (thanks!)
I was getting mixed performance results (probably because I approached this wrong - I was mixing and measuring multiple related/dependent changes, which made it difficult to understand which changes had a positive/negative impact).

At the same time, removing or replacing the &mut Vec<u8> parameter from ZlibStream's methods can result in some code simplification (and therefore it may be desirable even if it is performance-neutral). For example, there are some callers of ZlibStream::decompress which pass &mut vec![] or Vec::new (read_header_info and finish_decoding in mod.rs).

image-rs / image-png

Cap buffer sizes via `ZlibStream::set_max_total_output`. #429