image-rs / image

Encoding and decoding images in Rust

High compression ratio mode for lossless WebP #2221

Open Shnatsel opened 2 months ago

Shnatsel commented 2 months ago

WebP remains the lossless format with the highest compression ratio that is widely adopted.

"Lossless" AVIF is not truly lossless due to the use of a custom non-RGB colorspace, conversion into which loses some data, while JPEG XL is still far from widespread adoption.

#1978 added support for writing lossless WebP. The current code is fast, but the compression ratio is not very high.

It would be great to implement more of the compression strategies from libwebp and provide an option to trade CPU time for the high compression ratio the format can actually deliver.
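For illustration, here is a minimal sketch of how such an option might surface to callers. `WebPEncoder::new_lossless` and `write_image` reflect the existing image-crate API; the `with_effort` method and `EncodingEffort` level shown in the comment are purely hypothetical.

```rust
use std::fs::File;
use std::io::BufWriter;

use image::codecs::webp::WebPEncoder;
use image::{ExtendedColorType, ImageEncoder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let img = image::open("input.png")?.to_rgba8();
    let file = BufWriter::new(File::create("output.webp")?);

    // Current API: fast, fixed-parameter lossless encoding.
    let encoder = WebPEncoder::new_lossless(file);
    // Hypothetical extension: an opt-in effort level that spends more CPU
    // time searching encoding parameters for a smaller file, e.g.
    // `let encoder = encoder.with_effort(EncodingEffort::Best);`
    encoder.write_image(
        img.as_raw(),
        img.width(),
        img.height(),
        ExtendedColorType::Rgba8,
    )?;
    Ok(())
}
```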

fintelia commented 2 months ago

To be clear "not very high" is relative to what WebP is capable of. I just did a quick benchmark using the image from https://github.com/image-rs/image-webp/issues/71. I converted the image to lossless form so everything would be using an identical baseline and then compressed with a bunch of different encoders:

| Compression Ratio | Time | Encoder |
| --- | --- | --- |
| 51% | 0.2 seconds | image-png (fast) |
| 45% | 11 seconds | image-png (best) |
| 42% | 24 seconds | oxipng |
| 38% | 0.2 seconds | image-webp |
| 32% | 14 seconds | cwebp (default lossless settings) |

Of course, a single image doesn't tell the whole story, and image-webp's hard-coded encoding parameters leave it vulnerable to edge cases, but it shows the encoder is already quite respectable.

fintelia commented 2 months ago

The key change needed to achieve higher compression is to sweep over compression parameters and select the set that produces the smallest output. libwebp uses various heuristics to cut down the number of candidate parameter combinations, so that's probably worth looking into. Heuristics are particularly important for the faster compression settings, which don't have time to consider all the options.
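A rough sketch of the brute-force version of that sweep, assuming a hypothetical `Params` struct and `encode_with_params` helper (neither is a real image-webp API); in practice, heuristics would prune most candidates before doing a full encode:

```rust
/// Hypothetical encoding parameters to sweep over.
#[derive(Clone, Copy)]
struct Params {
    predictor_bits: u8,   // size of predictor transform blocks
    cache_bits: u8,       // color cache size (0 = disabled)
    use_cross_color: bool,
}

/// Encode the image with every candidate parameter set and keep the smallest result.
fn encode_best(image: &[u8], width: u32, height: u32) -> Vec<u8> {
    let mut best: Option<Vec<u8>> = None;
    for predictor_bits in [2u8, 3, 4] {
        for cache_bits in [0u8, 4, 8, 10] {
            for use_cross_color in [false, true] {
                let params = Params { predictor_bits, cache_bits, use_cross_color };
                let encoded = encode_with_params(image, width, height, params);
                if best.as_ref().map_or(true, |b| encoded.len() < b.len()) {
                    best = Some(encoded);
                }
            }
        }
    }
    best.expect("at least one candidate was encoded")
}

/// Placeholder for the per-parameter encode; not a real image-webp function.
fn encode_with_params(_image: &[u8], _w: u32, _h: u32, _p: Params) -> Vec<u8> {
    unimplemented!()
}
```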

As far as changes to the encoder itself, the biggest is probably adding support for varying the Huffman codes used across the image. Since storing the Huffman trees takes up space, you have to bin regions of the image that have similar entropy distributions to reduce the total number of trees stored. At the moment I don't fully understand the process for deciding which regions should share the same codes.
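For illustration, here is a sketch of one plausible approach: greedily merge tile histograms while the estimated entropy cost of the merge stays small. The cost model and merge criterion are assumptions for the sake of the example, not libwebp's actual algorithm.

```rust
/// Symbol histogram for one tile (or merged group of tiles).
#[derive(Clone)]
struct Histogram {
    counts: Vec<u64>,
}

impl Histogram {
    /// Shannon-entropy estimate of the bits needed to code this histogram's
    /// symbols with a dedicated Huffman table.
    fn bit_cost(&self) -> f64 {
        let total: u64 = self.counts.iter().sum();
        if total == 0 {
            return 0.0;
        }
        self.counts
            .iter()
            .filter(|&&c| c > 0)
            .map(|&c| {
                let p = c as f64 / total as f64;
                -(c as f64) * p.log2()
            })
            .sum()
    }

    fn merged(&self, other: &Histogram) -> Histogram {
        let counts = self
            .counts
            .iter()
            .zip(&other.counts)
            .map(|(a, b)| a + b)
            .collect();
        Histogram { counts }
    }
}

/// Greedily merge the pair of groups whose combined entropy cost increases the
/// least, as long as that increase stays below `max_increase` (a stand-in for
/// the bits saved by storing one fewer Huffman tree).
fn merge_histograms(mut groups: Vec<Histogram>, max_increase: f64) -> Vec<Histogram> {
    loop {
        let mut best: Option<(usize, usize, f64)> = None;
        for i in 0..groups.len() {
            for j in (i + 1)..groups.len() {
                let increase = groups[i].merged(&groups[j]).bit_cost()
                    - groups[i].bit_cost()
                    - groups[j].bit_cost();
                if best.map_or(true, |(_, _, b)| increase < b) {
                    best = Some((i, j, increase));
                }
            }
        }
        match best {
            Some((i, j, increase)) if increase < max_increase => {
                let merged = groups[i].merged(&groups[j]);
                groups.swap_remove(j); // j > i, so index i stays valid
                groups[i] = merged;
            }
            _ => return groups,
        }
    }
}
```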