JPEGXL lossless float16 is not lossless

cgohlke / imagecodecs

Image transformation, compression, and decompression codecs

https://pypi.org/project/imagecodecs

BSD 3-Clause "New" or "Revised" License


JPEGXL lossless float16 is not lossless #114

Open Skielex opened 1 week ago

Skielex commented 1 week ago

Hey,

First of all, really nice work on supporting so many image formats. I've been playing around with lossless encoding of some float16 data. However, when decoding, there appear to be artifacts. I don't know if the issue is in your wrapper or in libjxl, but before I raise an issue over there I wanted to bring it up here, since it's probably easy for you to check whether the issue resides here.

The issue can be replicated on WSL2 Ubuntu Python 3.10 with imagecodecs==2024.9.22 using this test:


import numpy as np
from imagecodecs import jpegxl_encode, jpegxl_decode

# Same random values stored as float16 and as float32
np.random.seed(42)
r_f16 = np.random.random((128, 128)).astype(np.float16)
r_f32 = r_f16.astype(np.float32)

r_jxl_f16 = jpegxl_encode(r_f16, lossless=True)
r_jxl_f32 = jpegxl_encode(r_f32, lossless=True)

r_jxl_f16_dec = jpegxl_decode(r_jxl_f16)
r_jxl_f32_dec = jpegxl_decode(r_jxl_f32)

np.testing.assert_equal(r_f32, r_jxl_f32_dec)  # Passes
np.testing.assert_equal(r_f16, r_jxl_f16_dec)  # Fails

Visually, it looks like this:

cgohlke commented 1 week ago

I can reproduce this on Windows, AMD64 and ARM64. It looks like a bug in libjxl to me. The glitches are quite rare though, maybe one in several thousand random values.

Try to reproduce this with the cjxl/djxl tools and open an issue at https://github.com/libjxl/libjxl/issues.
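To put a number on how rare the glitches are, the decoded array can be compared with the input bit for bit. The following is only a sketch: `bit_mismatches` is a hypothetical helper, not part of imagecodecs, and the `decoded` array is stand-in data built by hand rather than real codec output.

```python
import numpy as np


def bit_mismatches(original, decoded):
    """Return flat indices where two float16 arrays differ bit for bit."""
    a = np.asarray(original)
    b = np.asarray(decoded)
    if a.dtype != np.float16 or b.dtype != np.float16:
        raise ValueError("both arrays must be float16")
    if a.shape != b.shape:
        raise ValueError("shape mismatch")
    # Reinterpret the 16-bit floats as integers so NaN payloads and
    # signed zeros are compared exactly instead of by value.
    a_bits = np.ascontiguousarray(a).view(np.uint16).ravel()
    b_bits = np.ascontiguousarray(b).view(np.uint16).ravel()
    return np.flatnonzero(a_bits != b_bits)


# Stand-in data (assumption): one value altered by hand to mimic a glitch.
original = np.array([0.5, 3.07e-05, 1.0, 0.25], dtype=np.float16)
decoded = original.copy()
decoded[1] = np.float16(3.6e-07)

idx = bit_mismatches(original, decoded)
rate = idx.size / original.size
print(f"{idx.size} of {original.size} values differ ({rate:.2%}) at {idx.tolist()}")
```

With real data, `decoded` would be the output of `jpegxl_decode` for the float16 stream from the test above.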

Skielex commented 1 week ago

Ok, glad to hear it's not just me. I'll try the tools.

Skielex commented 1 week ago

The cjxl/djxl tools don't appear to currently allow any formats that support float16, so I haven't been able to find a way to reproduce the issue using the tools.

The issue appears to be related to specific input values. For instance, the input value 3.07e-05 is consistently decoded as 3.6e-07.
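Looking at the raw bit patterns of those two values may help narrow this down. Below is a minimal sketch using only the standard library; `half_fields` is a hypothetical helper, and it assumes the printed values round to the float16 values that were actually involved.

```python
import struct


def half_fields(value):
    """Round value to float16 and return (bits, sign, exponent, mantissa)."""
    # "e" is the struct format code for IEEE 754 binary16.
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    sign = bits >> 15               # 1 sign bit
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits
    mantissa = bits & 0x3FF         # 10 mantissa bits
    return bits, sign, exponent, mantissa


for v in (3.07e-05, 3.6e-07):
    bits, sign, exponent, mantissa = half_fields(v)
    # Exponent field 0 with a non-zero mantissa marks a subnormal number.
    kind = "subnormal" if exponent == 0 and mantissa else "not subnormal"
    print(f"{v:g}: bits={bits:#06x} sign={sign} exp={exponent} mant={mantissa} ({kind})")
```

Both values have an exponent field of 0, so they are subnormal float16 numbers. That would point at subnormal handling, although this alone does not say whether the fault is in the encoder or the decoder.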

Skielex commented 1 week ago

I tested all 65536 possible float16 values. It appears that 3070 (~4.7%) of these are changed by encode-decode. Interpreted as uint16, the four affected regions are: [(512, 1024), (31745, 32768), (33280, 33792), (64513, 65535)]. If we plot which bits are turned on, it looks like this:
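The exhaustive test described in this comment could be sketched roughly as follows. `changed_regions` is a hypothetical helper that takes any encode-decode callable. `fake_roundtrip` is a stand-in used only so the sketch runs without imagecodecs, and it does not model what libjxl actually does to the values.

```python
import numpy as np


def changed_regions(roundtrip):
    """Round-trip all 65536 float16 bit patterns and return the
    half-open (start, stop) runs of patterns that come back changed."""
    bits = np.arange(65536, dtype=np.uint16)
    image = bits.view(np.float16).reshape(256, 256)
    out = np.asarray(roundtrip(image))
    if out.dtype != np.float16 or out.shape != image.shape:
        raise ValueError("round trip must return a 256x256 float16 array")
    out_bits = np.ascontiguousarray(out).view(np.uint16).ravel()
    changed = out_bits != bits
    # Pad with False so runs touching either end still get both edges.
    padded = np.concatenate(([False], changed, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(a), int(b)) for a, b in zip(edges[0::2], edges[1::2])]


def fake_roundtrip(image):
    """Stand-in codec (assumption): flips the lowest bit of patterns 512..1023."""
    b = image.view(np.uint16).copy()
    mask = (b >= 512) & (b < 1024)
    b[mask] ^= 1
    return b.view(np.float16)


print(changed_regions(fake_roundtrip))

# With imagecodecs installed, the real codec could be plugged in like this:
# from imagecodecs import jpegxl_encode, jpegxl_decode
# print(changed_regions(lambda im: jpegxl_decode(jpegxl_encode(im, lossless=True))))
```

Comparing bit patterns rather than values means NaN payload changes count as changes. In the float16 layout, patterns 512 to 1023 and 33280 to 33791 are subnormals with the top mantissa bit set, and patterns 31745 to 32767 and 64513 to 65535 are NaN encodings. The four listed regions add up to 3070 only if 65535 itself is included in the last one.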

cgohlke commented 1 week ago

> currently allow any formats that support float16

Did you check EXR? Also, the GIMP plugin supports float16. Not sure, I have not tried either.

Skielex commented 1 week ago

Looks like EXR support has been dropped for the time being:

cjxl.exe INPUT OUTPUT [OPTIONS...]
 INPUT
    the input can be JXL, PPM, PNM, PFM, PAM, PGX, PNG, APNG, GIF, JPEG

https://github.com/libjxl/libjxl/issues/1662

Gonna try GIMP.

Skielex commented 1 week ago

Tried GIMP 2.99.19, which supports JPEG-XL. However, support appears a bit buggy.

Opening float16 or float32 jxl-files encoded using imagecodecs appears to work. float16 files contain the same artifacts that appear when decoding with imagecodecs, and float32 files appear correct, as when decoding with imagecodecs.

GIMP does not appear to support saving JPEG-XL as floating types.

I've already spent too much time on this, and there doesn't appear to be any good way of encoding float16 using libjxl outside creating my own set of bindings. I'll report the issue over at their repo and see what they think.

kmilos commented 6 days ago

FWIW, darktable can do fp16 EXR and TIFF. It should also decode any JXL.

For JXL encoding, 4.8.1 does just lossy floats, only the dev version can do fp32 lossless but crashes for fp16 lossless (also reported to libjxl; seems like there is no built-in fp32->fp16 down-conversion like there is when targeting integer lossless...).

kmilos commented 6 days ago

> Looks like EXR support has been dropped for the time being

I have it here (MSYS2 package). Just means it was built w/o OpenEXR present for you. Doesn't mean it works though, haven't tested...

$ cjxl.exe
JPEG XL encoder v0.11.0 0.11.0-1 [AVX2,SSE4,SSE2]
Usage: C:\msys64\ucrt64\bin\cjxl.exe INPUT OUTPUT [OPTIONS...]
 INPUT
    the input can be JXL, PPM, PNM, PFM, PAM, PGX, PNG, APNG, GIF, JPEG, EXR

Skielex commented 5 days ago

@kmilos, thanks for chipping in. At this point, I'd be surprised if the error isn't in the encoder, especially given that the encoded float16 256x256 image containing all 65536 possible values is only 826 bytes, while the encoded float32 image with the same values is 92 kB. I've attached the files in a zip.
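For reference, a 256x256 image holding every float16 value can be built as below. This is a sketch under one assumption: the values are laid out in bit-pattern order, which the thread does not state. It also shows the uncompressed sizes that the encoded sizes can be compared against.

```python
import numpy as np

# Every 16-bit pattern exactly once, reinterpreted as float16.
# The row-major bit-pattern ordering is an assumption.
bits = np.arange(65536, dtype=np.uint16)
f16 = bits.view(np.float16).reshape(256, 256)

# Widening float16 to float32 is exact, so both images hold the same values.
f32 = f16.astype(np.float32)

raw_f16 = f16.nbytes  # uncompressed size in bytes, 2 bytes per value
raw_f32 = f32.nbytes  # uncompressed size in bytes, 4 bytes per value
print(f"raw float16: {raw_f16} bytes, raw float32: {raw_f32} bytes")

# Sanity check on finite values: narrowing back reproduces the float16 input.
finite = np.isfinite(f16)
assert np.array_equal(f32[finite].astype(np.float16), f16[finite])
```

A small encoded size is not proof of data loss by itself, since an ordered ramp of values can compress very well. The decoded bit patterns still need to be checked against the input.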

images.zip

If you want, you're very welcome to try opening the f16.jxl file in darktable to see whether it's correct or has the same artifacts. Then again, if darktable relies on libjxl, we wouldn't really learn anything new.

The zip also contains two EXR files that should contain the same 65536 values as the JXL files. If you've got cjxl/djxl working to EXR, perhaps you can check if they encode correctly.