libjxl / libjxl

JPEG XL image format reference implementation
BSD 3-Clause "New" or "Revised" License

Lossless 1 bit slower than Djvulibre #3775

Open incoheart opened 3 months ago

incoheart commented 3 months ago

Describe the bug
cjxl takes about half a minute to compress a binary image. DjVuLibre is decades old and can compress a binary image in less than a second, with superior compression.

To Reproduce
You can compare the results on a scan or digital linework with ImageMagick, DjVuLibre, and cjxl. The example image is 2544x3285, a 300 DPI scan of an ink drawing. cjb2 from DjVuLibre compresses that image in 1 second, but cjxl takes 47 seconds to reach a similar size.

Generating monochrome images from the scan with ImageMagick:
magick scanned_image.png -monochrome monochrome_image.png
magick scanned_image.png -monochrome monochrome_image.pbm

Compressing with cjxl in 47 seconds: cjxl -d 0 -e 9 monochrome_image.png monochrome_image_47_seconds.jxl

Compressing with cjxl in 81 seconds: cjxl -d 0 -e 9 -E 3 monochrome_image.png monochrome_image_81_seconds.jxl

Compressing with cjxl in 4 seconds: cjxl -d 0 monochrome_image.png monochrome_image_4_seconds.jxl

Compressing in 1 second with cjb2 from DjVuLibre: cjb2 monochrome_image.pbm monochrome_image_1_second.djvu

Encoder    Command              Size      Time
libjxl     cjxl -d 0 -e 9 -E 3  79314 B   81 seconds
libjxl     cjxl -d 0 -e 9       79314 B   47 seconds
libjxl     cjxl -d 0            100408 B  4 seconds
DjVuLibre  cjb2                 76096 B   <1 second
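To compare these results independently of image size, the byte counts can be converted to bits per pixel, the unit cjxl prints in its own log output. The following is a minimal sketch that uses only the dimensions and sizes reported above; the function name `bits_per_pixel` is made up for illustration.

```python
# Image dimensions and file sizes are taken from the report above.
WIDTH, HEIGHT = 2544, 3285


def bits_per_pixel(size_bytes: int, width: int, height: int) -> float:
    """Compressed size expressed as bits per pixel."""
    return size_bytes * 8 / (width * height)


results = {
    "cjxl -d 0 -e 9 -E 3": 79314,
    "cjxl -d 0 -e 9": 79314,
    "cjxl -d 0": 100408,
    "cjb2": 76096,
}

for command, size in results.items():
    bpp = bits_per_pixel(size, WIDTH, HEIGHT)
    overhead = size / results["cjb2"] - 1  # relative to the cjb2 output
    print(f"{command:22s} {size:7d} B  {bpp:.4f} bpp  {overhead:+.1%} vs cjb2")
```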

Expected behavior
For my example image, cjxl should compress down to ~75KB in a few seconds. Compressing a 1-bit binary image should take only a small amount of computing time. Since DjVuLibre has been doing this well for years, JXL should be competitive with it.

Screenshots
The scanned source image is an example of what artists might wish to transmit over the web when JXL becomes the new web standard. This is before it is converted to binary using ImageMagick.

scanned_image

Monochrome image that is to be compressed with cjxl and cjb2. This final image is 94KB in JXL after four seconds of compression and 75KB in DjVuLibre after 1 second of compression.

monochrome_image

jonnyawsom3 commented 3 months ago

Got it down to 77565 B using cjxl Web.png Web.jxl -d 0 -g 3 -I 100 -P 0 -e 10 in 11 seconds, but still a bit off...

jonsneyers commented 3 months ago

JBIG2 (the underlying codec here) is specifically designed for bitonal (1-bit) images. It has coding tools specifically for that, and it can only do that. This means it does not have to take into account the possibility of any other sample format, and it can internally represent 1-bit images as actual bitmaps (i.e. using 1 bit per pixel).

The current libjxl implementation does not have any specialized code paths for lower sample precisions than the maximum precision allowed by the spec, which is 32-bit. So it will internally use 32 bits per pixel, even if the actual image only has 1-bit precision. Also the encoder heuristics (for patch detection, MA tree learning, etc) are generic and not at all tuned or specialized for the 1-bit case.

It's a bit similar to the comparison with lossless WebP: that format is limited to 8-bit RGBA, so the libwebp implementation is hardcoded for that and can be faster and better than the current libjxl implementation, which uses 32-bit for everything. libjxl therefore uses 4 times as much memory for internal buffers, which also has implications for speed and the amount of SIMD parallelism. (A notable exception is fast-lossless e1, which is specialized for the 8 to 16-bit case, at least on the encode side.)
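To put rough numbers on the buffer sizes discussed here, the sketch below computes the bytes needed for one channel of the 2544x3285 scan at 1, 8, and 32 bits per sample. It is plain arithmetic rather than a description of libjxl's actual allocations, and `buffer_bytes` is a made-up helper.

```python
def buffer_bytes(width: int, height: int, bits_per_sample: int) -> int:
    """Bytes needed for one channel, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_sample + 7) // 8
    return row_bytes * height


W, H = 2544, 3285  # the scan from this issue

packed_1bit = buffer_bytes(W, H, 1)   # true bitmap, 1 bit per pixel
uint8 = buffer_bytes(W, H, 8)         # one byte per pixel
int32 = buffer_bytes(W, H, 32)        # 32 bits per pixel

print(packed_1bit, uint8, int32)
print(int32 // uint8, int32 // packed_1bit)  # ratios: 4 and 32
```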

There is nothing really that prevents us from making specialized encode/decode paths for the 1-bit case, which could be substantially faster and probably also perform better in terms of compression than what we have now. To make it as effective as possible, we would need to add a new pixel format to the libjxl API though, since currently UINT8 is the most compact pixel format we have in the API (one byte per pixel). I'm not sure if this should be a big priority though. While 1-bit images have been ubiquitous in the past (most scanners and faxes used it), I think moving forward, in most use cases, you don't really want to threshold your images to 1-bit but have some shades of gray too, if only to anti-alias the edges.
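As an illustration of what a more compact pixel format than UINT8 could look like, here is a minimal sketch that packs one row of 0/1 samples into 1 bit per pixel. This is not an existing libjxl API: the function names are hypothetical, and the most-significant-bit-first order is an assumption chosen for the example.

```python
def pack_row(samples):
    """Pack a row of 0/1 samples into bytes, most significant bit first."""
    out = bytearray((len(samples) + 7) // 8)
    for x, sample in enumerate(samples):
        if sample:
            out[x >> 3] |= 0x80 >> (x & 7)
    return bytes(out)


def unpack_row(data, width):
    """Inverse of pack_row: recover `width` samples from the packed bytes."""
    return [(data[x >> 3] >> (7 - (x & 7))) & 1 for x in range(width)]


row = [1, 0, 0, 0, 0, 0, 0, 1, 1]  # 9 pixels need 2 bytes instead of 9
packed = pack_row(row)
print(packed.hex(), unpack_row(packed, len(row)) == row)
```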

jonsneyers commented 3 months ago

With some manual hacking I found a way to improve compression a little, enough to beat JBIG2 on this image:

JPEG XL encoder v0.11.0 637806b1 [NEON]
Encoding [Modular, lossless, effort: 10]
channel 0: 2544x3285, range 0 .. 1
Doing one squeeze
channel 0: 1272x3285, range 0 .. 1
channel 1: 1272x3285, range -1 .. 1
Channels 0-1 can be represented using a 4-color palette. 1
  Color 0 :  1 0
  Color 1 :  1 1
  Color 2 :  0 -1
  Color 3 :  0 0
channel 0: 4x2, range -1 .. 1
channel 1: 1272x3285, range 0 .. 3
Compressed to 71700 bytes (0.069 bpp).
2544 x 3285,  0.668 MP/s [0.67, 0.67], , 1 reps, 12 threads.

The trick here was to apply one squeeze step followed by a palette step, which has the effect of turning the 1-bit image into a 2-bit image of half the size. This image compresses better since the MA tree gets a larger effective local neighborhood for context, etc.
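A simplified sketch of why this works: squeezing a pair of 1-bit pixels (A, B) yields an average and a residual, and only four (average, residual) combinations can occur, so a 4-color palette covers them. The averaging formula below is an assumption based on my understanding of the JPEG XL squeeze transform, and the residual is taken as plain A - B, without the tendency term the real transform also subtracts. The helper names are invented for illustration.

```python
def squeeze_row(row):
    """One horizontal squeeze step on a row of even length.

    Assumed simplification: residual = A - B, with no tendency term.
    """
    avgs, residuals = [], []
    for a, b in zip(row[0::2], row[1::2]):
        avgs.append((a + b + (a > b)) >> 1)  # average, rounded towards A
        residuals.append(a - b)
    return avgs, residuals


def palettize(avgs, residuals, palette):
    """Replace each (average, residual) pair by its index in the palette."""
    index = {color: i for i, color in enumerate(palette)}
    return [index[pair] for pair in zip(avgs, residuals)]


# Palette order copied from the encoder log above.
PALETTE = [(1, 0), (1, 1), (0, -1), (0, 0)]

row = [0, 0, 1, 1, 1, 0, 0, 1]  # 8 one-bit pixels
avgs, residuals = squeeze_row(row)
indexed = palettize(avgs, residuals, PALETTE)
print(avgs, residuals, indexed)  # 4 two-bit palette indices
```

Under these assumptions, the four pairs produced for 1-bit input are the same four colors listed in the log, and the indexed row has half the width of the input.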

m jxl

jonnyawsom3 commented 3 months ago

Do you think the "Palette Squeeze" could be applied to any <=4 bit input? Or is this a specialised case for bitonal images?

jonsneyers commented 3 months ago

This trick could be applied to higher color counts too, and might be effective on e.g. low-color palette images too (e.g. pixel art with only 4 or 16 colors). I have no idea how effective it would be for compression though. Probably depends a lot on the specifics of the image content...

incoheart commented 3 months ago

I am interested to know whether it works well on these two different images.

Here is a sketch with in-between grays, where cjxl already beats the JBIG2 feature in cpaldjvu from DjVuLibre: 22685 B as DjVu versus 12975 B as JXL. cjxl is already good, fast, and superior with the encoder I have available to me; can the hack do better?

sketch

cjxl -d 0 -e 9 -I 100 -g 3 -P 1 sketch.png sketch.jxl
JPEG XL encoder v0.8.3 [AVX2,SSE4,SSSE3,SSE2]
Read 600x600 image, 31869 bytes, 70.9 MP/s
Encoding [Modular, lossless, effort: 9],
Compressed to 12975 bytes (0.288 bpp).
600 x 600, 0.59 MP/s [0.59, 0.59], 1 reps, 4 threads.

By the way, the output of this command is even smaller, 12837 bytes instead of 12975 bytes, if the input is PGM instead of PNG. That seems like another bug!

Saving it starting from PGM instead of PNG:

cjxl -d 0 -e 9 -I 100 -g 3 -P 1 sketch.pgm sketch.jxl
JPEG XL encoder v0.8.3 [AVX2,SSE4,SSSE3,SSE2]
Read 600x600 image, 360014 bytes, 736.1 MP/s
Encoding [Modular, lossless, effort: 9],
Compressed to 12837 bytes (0.285 bpp).
600 x 600, 0.17 MP/s [0.17, 0.17], 1 reps, 4 threads.

Anyway! This second image is an example of webp performing better than JXL for palettes. It has 220 colors. This happens a lot when I use GIMP's positioned dither, which is very attractive when creating an indexed image to share on the web.

977709 B JXL (74 seconds of compression time)
903070 B webp (3 seconds of compression time)

The commands for reference:

JXL
cjxl -d 0 -e 9 20240420_because_its_your_obsession.png 20240420_because_its_your_obsession.jxl
JPEG XL encoder v0.8.3 [AVX2,SSE4,SSSE3,SSE2]
Read 2000x2000 image, 1089679 bytes, 113.2 MP/s
Encoding [Modular, lossless, effort: 9],
Compressed to 977709 bytes (1.955 bpp).
2000 x 2000, 0.04 MP/s [0.04, 0.04], 1 reps, 4 threads.

WEBP (I use ImageMagick)
magick 20240420_because_its_your_obsession.png -define webp:lossless=true 20240420_because_its_your_obsession.webp

20240420_because_its_your_obsession

jonsneyers commented 3 months ago

Doing the 2x1 trick on the grayscale sketch.png image didn't give better results than not doing the trick — I think it probably only really helps for 1-bit images.

For the 220-color image with dithering: I would try avoiding dithering and applying compression on the actual original image. If you want to do lossy, there are better ways of doing that in jxl than just reducing the number of colors. The problem with dithering is that it can add a lot of entropy to something that would otherwise compress quite well (e.g. a smooth gradient).
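The effect of dithering on compressibility can be illustrated with a toy experiment. The sketch below compresses a smooth gradient and a dithered version of it with zlib, used here only as a generic stand-in compressor (not libjxl). The dither is a simple random-threshold dither, not GIMP's positioned dither, so the numbers only hint at the general effect.

```python
import random
import zlib

SIZE = 256
rng = random.Random(0)  # fixed seed so the run is repeatable

# Smooth horizontal gradient: every row is 0, 1, ..., 255.
gradient = bytes(x for _ in range(SIZE) for x in range(SIZE))

# Random-threshold dither down to two levels (0 or 255).
dithered = bytes(
    255 if x > rng.randrange(256) else 0
    for _ in range(SIZE)
    for x in range(SIZE)
)

gradient_size = len(zlib.compress(gradient, 9))
dithered_size = len(zlib.compress(dithered, 9))
print(gradient_size, dithered_size)
```

Even though the dithered image uses only two values instead of 256, it should come out considerably larger here, because the dither pattern is noise that the compressor cannot predict.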

Anyway, in the latest released version of libjxl, the result seems to be a little better already for that image:

cjxl t2.png -d 0 t2.jxl -e 10
JPEG XL encoder v0.10.3 0.10.3 [NEON]
Encoding [Modular, lossless, effort: 10]
Compressed to 917.9 kB (1.836 bpp).
2000 x 2000,  0.102 MP/s [0.10, 0.10], , 1 reps, 12 threads.

incoheart commented 3 months ago

Thank you so so much for testing that!

It would be cool to have a bash script or some way that the layman can access the 1-bit images hack. I would love to use it on my ink drawings and show my friends what the latest technology can do.

But well, thanks to the other suggestion and the response about lower bitdepth settings I have a fast command for binary images. It's nine times as slow as cjb2 from DjVuLibre and a little bit bigger. It's 77734 bytes in nine seconds compared to the earlier 79314 bytes in 47 seconds or 100408 bytes in four seconds. Reminder that DjVuLibre is 76096 bytes instantly and jonsneyers' hack is 71700 bytes in ??? seconds. (jonsneyers' hack is competitive with proprietary JBIG2, unlike DjVuLibre's cjb2 tool, which I've only been personally able to access using the tool pdfsizeopt for the PDF format.)

cjxl monochrome.png --override_bitdepth=1 -P 1 -g 3 -e 9 -I 100 -d 0 monochrome.jxl

and so to batch process my monochrome images during the next Inktober for sharing in a JXL supporting browser like PaleMoon I will use,

for i in *.png ; do cjxl --override_bitdepth=1 -P 1 -g 3 -e 9 -I 100 -d 0 "$i" "${i%.*}.jxl" ; done