thorfdbg / libjpeg

A complete implementation of 10918-1 (JPEG) coming from jpeg.org (the ISO group) with extensions for HDR, lossless and alpha channel coding standardized as ISO/IEC 18477 (JPEG XT).

Huffman table contains 256 entries (for no good reason) #100

Closed SimonSegerblomRex closed 4 months ago

SimonSegerblomRex commented 4 months ago

Problem

Following this advice in the README:

The following options exist for lossless coding integer:

predictive Rec. ITU-T T.81 | ISO/IEC 10918-1 coding. Note, however, that not many implementations are capable of decoding such stream, thus this is probably not a good option for all-day purposes.

$ jpeg -p -c infile.ppm out.jpg

While the result is a valid Rec. ITU-T T.81 | ISO/IEC 10918-1 stream, most other implementations will hick up and break, thus it is not advisable to use it.

libjpeg produces JPEG files that libjpeg-turbo refuses to decode with the error message

Bogus Huffman table definition

due to this check (discussed here).

Steps to reproduce

Encode this image (or any(?) other PGM file) using libjpeg:

 ./jpeg -p -c 14bit.pgm out.jpg

Observe that the Huffman table has 256 entries.
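One way to make that observation without a dedicated JPEG inspection tool is to read the DHT (0xFFC4) segments directly. The following is a minimal sketch, not part of libjpeg or libjpeg-turbo; the function name `huffman_table_sizes` is made up for illustration, and it only looks at tables that appear before the first scan.

```python
import struct


def huffman_table_sizes(data: bytes):
    """Return (table_class, table_id, symbol_count) for every Huffman
    table defined in DHT segments before the first SOS marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # SOS or EOI: stop before entropy-coded data
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xC4:          # DHT: one or more tables in this segment
            seg = data[pos + 4:pos + 2 + length]
            i = 0
            while i + 17 <= len(seg):
                tc_th = seg[i]                 # high nibble: class, low nibble: id
                count = sum(seg[i + 1:i + 17])  # 16 code-length counts
                tables.append((tc_th >> 4, tc_th & 0x0F, count))
                i += 17 + count                # skip the symbol values
        pos += 2 + length           # the length field includes its own two bytes
    return tables
```

Calling `huffman_table_sizes(open("out.jpg", "rb").read())` on the files from this report should presumably show a symbol count of 256 without `-h` and 10 with it.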

Workaround

Encode the image using option -h:

 ./jpeg -p -c -h 14bit.pgm out.jpg

Observe that the Huffman table has 10 entries.

Reasoning

One might argue that libjpeg-turbo is at fault for not decoding this image, since it is technically a valid JPEG file according to the specification, but the 256-entry table doesn't make sense either and seems to be a bug in libjpeg. Also following up on your comment here @thorfdbg:

If there are, for some image or some configuration, tables generated that populate more than the first 16 entries, please let me know because I would not know how this has happened (code for that in coding/huffmantemplate.cpp, lines 142 and following, val_dc_luminance/chrominance lists the symbols).
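For context on why 256 entries look excessive: in the lossless mode of Rec. ITU-T T.81 the Huffman symbols are difference categories (SSSS) from 0 to 16, so a table needs at most 17 symbols. The sketch below shows a sanity check in that spirit. It is an illustration only, not libjpeg-turbo's actual check; the function name and the bound of 16 as the rejection threshold are assumptions.

```python
# SSSS difference categories in lossless mode run from 0 to 16 (T.81, Annex H).
MAX_LOSSLESS_CATEGORY = 16


def out_of_range_symbols(symbols, max_category=MAX_LOSSLESS_CATEGORY):
    """Return the table symbols that fall outside 0..max_category,
    i.e. symbols a lossless scan would never need."""
    return [s for s in symbols if not 0 <= s <= max_category]
```

An empty result means the table only lists usable categories; a table populated with all byte values 0..255 would yield 239 surplus symbols.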

SimonSegerblomRex commented 4 months ago

I guess this might be a duplicate of https://github.com/thorfdbg/libjpeg/issues/88#issuecomment-1791584122 ... The documentation should be updated though, and it's a bit unfortunate that the default behaviour (when not using -h) is to produce JPEG files incompatible with libjpeg-turbo.

thorfdbg commented 4 months ago

Thank you for reporting. Unlike in the 12-bit lossy case, I really do not recall in which mode the encoder could possibly need the additional symbols, so I removed them.

Fixed in 1.69, thank you.

SimonSegerblomRex commented 4 months ago

Thank you for the quick fix!