thorfdbg / libjpeg

A complete implementation of 10918-1 (JPEG) coming from jpeg.org (the ISO group) with extensions for HDR, lossless and alpha channel coding standardized as ISO/IEC 18477 (JPEG XT).
327 stars 81 forks source link

fix an overly strict constraint in the huffman table generation logic, w... #1

Open richard42 opened 9 years ago

richard42 commented 9 years ago

...hich causes the software to throw an error and reject legal huffman tables.

I am using your library to decode CinemaDNG images from a new BlackMagic Ursula cinema camera. It failed to decode the images with the following error:

reading a JPEG file failed - error -1038 - Huffman table corrupt - entry depends on more bits than available for the bit length

To debug this, I place the following statement at line 546:

 printf("Huffman table: There are %i code words with length %i\n", m_ucLengths[i], i+1);

The output was:

Huffman table: There are 0 code words with length 1 Huffman table: There are 3 code words with length 2 Huffman table: There are 1 code words with length 3 Huffman table: There are 1 code words with length 4 Huffman table: There are 1 code words with length 5 Huffman table: There are 1 code words with length 6 Huffman table: There are 1 code words with length 7 Huffman table: There are 1 code words with length 8 Huffman table: There are 1 code words with length 9 Huffman table: There are 1 code words with length 10 Huffman table: There are 1 code words with length 11 Huffman table: There are 1 code words with length 12 Huffman table: There are 2 code words with length 13

So it's a legal table, but the last 3 bit lengths have no code values. As a result of the truncated (but legal) table, this huffman generator logic throws an error. My code change applies the correct constraint which allows for tables which end early, but will still fail tables which are actually corrupted.

richard42 commented 9 years ago

I went back a very old version of libjpeg 6b which was patched for lossless support to try and get this working with the Ursula camera images. I saw that your project is GPL3 licensed and we cannot integrate it with our commercial application. The old libjpeg 6b had also complained about invalid huffman tables. I found that the code was completely different than yours, and it contained this comment:

/* code is now 1 more than the last code used for codelength si; but * it must still fit in si bits, since no code is allowed to be all ones. */

If this is true that no huffman code is allowed to be all ones (is this used for an escape code?) then the camera is generating an illegal stream. Thanks, Adobe.

thorfdbg commented 9 years ago

Richard,

...hich causes the software to throw an error and reject legal huffman tables.

I am using your library to decode CinemaDNG images from a new BlackMagic Ursula cinema camera. It failed to decode the images with the following error:

reading a JPEG file failed - error -1038 - Huffman table corrupt -
entry depends on more bits than available for the bit length

thanks for your report. Would it be possible to send me the image that causes the failure? I would like to understand the issue a bit better.

Anyhow, before you use the git version - there is a later version you may want to try. As you might have noted, the library is the reference implementation for the next JPEG standard (JPEG XT), and you may get the latest version on our committee home page at www.jpeg.org. From there, navigate to JPEG XT, then software. You can download there one of the two versions, the GPL version (all of JPEG including arithmetic coding, hierarchical and JPEG LS, JPEG XT parts 6, 8, 9 and part 7 profile C) and the ISO version (JPEG Huffman coding only, no arithmetic, no hierarchical, no JPEG LS, but all parts of JPEG XT, i.e. 6,7 (complete), 8 and 9). As JPEG XT is an extension of JPEG, it will do exactly what the git version does - and is in fact based on JPEG.

As you will find, the official version on jpeg.org is quite a bit ahead of the git version. It may (or may not) fix the problem you have observed.

Greetings, Thomas

thorfdbg commented 9 years ago

Am 05.05.2015 um 20:47 schrieb Richard Goedeken:

I went back a very old version of libjpeg 6b which was patched for lossless support to try and get this working with the Ursula camera images. I saw that your project is GPL3 licensed and we cannot integrate it with our commercial application.

It depends what you need. The ISO version of the same software on www.jpeg.org is not GPL'd, but it does not include lossless JPEG. It includes all of JPEG, and lossless JPEG XT, which is a backwards-compatible version of the old JPEG. If you need a (commercial) license of the GPL'd code, this can be arranged, but it won't be for free.

Note also that there are a couple of corrupt lossless JPEG images around that use an invalid code for a special amplitude class (the largest class is coded indirectly). The github version and the version www.jpeg.org only support correct codestreams - as this is by design a reference implementation, and it is not supposed to aid as a reference for working around bugs of other implementations.

The old libjpeg 6b had also complained about invalid huffman tables. I found that the code was completely different than yours, and it contained this comment:

/* code is now 1 more than the last code used for codelength si; but
* it must still fit in si bits, since no code is allowed to be all ones.
*/

If this is true that no huffman code is allowed to be all ones (is this used for an escape code?) then the camera is generating an illegal stream. Thanks, Adobe.

Actually, there is no (explicit) requirement in the standard that a Huffman code cannot be all ones, and there is neither an escape code reserved for that. There is an (implicit) requirement that only the largest (and last) Huffman code could consist of all ones, simply by the way how JPEG constructs Huffman codes. Or rather, based on the fact that JPEG uses so-called "canonical Huffman codes" for which this fact holds. It is rather a corollary from how these "special codes" work.

However, before I can actually make a statement on whether this is a defect of the software, or a defect of the codestream, I would need to look at the codestream itself.

Greetings, Thomas

richard42 commented 9 years ago

Thanks for the info. I actually found and tried the Jpeg XT library yesterday, but I only ran the ISO licensed version and of course it said that it didn't support the lossless file. I will send you the file at your email address tomorrow for your analysis. If the all-ones huffman code is legal then I'm pretty sure that the code here has a defect, and my patch is the correct fix. I've written huffman codecs before, so I understand the algorithm. My manager should be contacting you in the next day or two to discuss licensing terms.