libjxl / libjxl

JPEG XL image format reference implementation
BSD 3-Clause "New" or "Revised" License
2.61k stars 251 forks source link

Reference API generates corrupt JXL files that cannot be decoded to JPEG anymore. #949

Closed schmorp closed 2 years ago

schmorp commented 2 years ago

Describe the bug

While cjxl, which uses an undocumented API, can create jxl files that allow jpeg reconstruction, the official (C) API currently (commit ea2612d6df99e9878b51e315935f9d6201f5fe47) generates corrupt files that can be viewed, but fail to reconstruct to the original JPEG files.

To Reproduce

I have an encoder (not written in C) that basically does this:

JxlEncoderCreate JxlEncoderUseContainer JxlEncoderStoreJPEGMetadata JxlEncoderFrameSettingsCreate JxlEncoderSetFrameLossless JxlEncoderAddJPEGFrame JxlEncoderCloseInput

That is, it doesn't set any options and relies on AddJPEGFrame to set defaults. The expectation is that it is possibe to reconstruct the original JPEG from the encoded file. The resulting jxl file is 283kb large and has the following structure:

box: type: "JXL " size: 12 box: type: "ftyp" size: 20 box: type: "jxlp" size: 22 box: type: "jbrd" size: 284 box: type: "jxlp" size: 282464

The image displays fine, i.e. it can be (pixel-)decoded without problems both by djxl as well as by my own image viewer that uses the reference API.

Trying to reconstruct the original JPEG with the C api fails with status 2, while trying to reconstruct using djxl fails with:

Read 282802 compressed bytes. Unknown compressed image format

Debugging the problem, it seems the Exif information is missing. Indeed, if I use cjxl, I get a different, larger file (316k), with the following structure, which reconstructs fine:

box: type: "JXL " size: 12 box: type: "ftyp" size: 20 box: type: "Exif" size: 8719 box: type: "jbrd" size: 284 box: type: "jxlc" size: 282470

The original JPEG has the following structure:

SOI(start_of_image) @0x00000000 APP0 @0x00000002 APP1 @0x00000014 APPE @0x00000148 DQT(quantization_table) @0x00000158 SOF0(huff_baseline) @0x0000019d DRI(restart_interval) @0x000001aa DHT(huff_table) @0x000001b0 SOS(start_of_scan) @0x00000284 EOI(end_of_image) @0x00054398

Expected behavior

  1. JPEG reconstruction should not fail for files encoded by the reference implementation.
  2. cjxl and djxl should use the same code to encode/decode as the public library API, and should result in identical results (given the same parameters). Using a special private API for cjxl/djxl will always carry the risk of it being different than the official API.
  3. There should be a testsuite for the official API to catch these type of bugs early.
schmorp commented 2 years ago

I forgot to mention that the encoder of course also calls JxlEncoderProcessOutput in a loop until it returns JXL_ENC_SUCCESS.

schmorp commented 2 years ago

As an experiment, I copied Exif info manually during encoding (using JxlEncoderAddBox), but this didn't seem to help (djxl can't decode the jxl file, even with -j, the reference library can't reconstruct the jpeg, but can decode the pixel data without error), so maybe it is not caused bymissing Exif info itself. (I "verified" that it does read the exif data, as I needed a few tries to get the format right, with the tiff_header_offrset at the beginning and missing exif magic).

jonsneyers commented 2 years ago

This should fix the djxl bug: https://github.com/libjxl/libjxl/pull/985

The api decoder bug still exists that it does not properly find the exif (and xmp, and jumbf) metadata for the purpose of jpeg reconstruction.

We are planning to migrate cjxl/djxl to use the API soon.

schmorp commented 2 years ago

Thanks for the quick turnaround.

I gave the patch a try, but the patch doesn't fix the problem in djxl, although it clearly goes into the right direction. With the patch, djxl now does a lossy decode/reencode (and gives better error messages):

./tools/box/box.cc:308: JXL_FAILURE: Exif box size does not match JPEG reconstruction data ./tools/djxl.cc:210: JXL_FAILURE: Failed to decode JXL to JPEG Warning: could not decode losslessly to JPEG. Retrying with --pixels_to_jpeg... Decoded to pixels.

In fact, this behaviour is quite a bit worse as before, as it doesn't return an error status while generating broken output, so it looks as if everything worked, even though data loss happened.

In my opinion, when jpeg reconstruction should be possible but fails, djxl should never automatically do a pixels-to-jpeg conversion, at least not without signalling an error.

schmorp commented 2 years ago

OH, sorry, I didn't understand - indeed, the patch seems to do fix everything but finding the metadata. The only concern would be djxl generating a "corrupt" (non-reconstructed lossy) jpeg file without signalling an error to the caller.

mo271 commented 2 years ago

If I'm not mistaken, the remaining concern is the same as the one brought up here: https://github.com/libjxl/libjxl/issues/950

schmorp commented 2 years ago

Not sure why this is closed, the reference api still generates undecodable files when transcoding from jpeg (as described by jonsneyers). The only difference is that corruption now happens silently, which is quite a bit worse than before, as reported in december.

But maybe I am wrong here - being able to transcode jpeg to jxl and back in my mind is supposedly (?) a major feature of jpeg-xl and and libjxl, yet jpeg trancoding is completely broken for 5 months now. Maybe I am wrong and transcoding jpeg files is not really supported, and this is why the bug is being closed?

jonsneyers commented 2 years ago

@mo271 this one is still open. The encode api silently strips exif and xmp when recompressing jpeg, which is a serious issue that needs to be resolved before 1.0.

Currently only cjxl can correctly recompress jpegs with exif/xmp metadata. In the encode api, it still needs to be fixed. I am not sure if it correctly works in the decode api either. This is a high priority bug.