ome / ome-codecs

OME image compression (coder-decoder implementations)
Other
4 stars 13 forks source link

Allow the Huffman decoding tree to go all the way down to the 16-bit depth required by the standard #16

Closed LoopinFool closed 5 years ago

LoopinFool commented 5 years ago

I found some medical images in the wild that the ome-codecs library failed to decode properly. The resulting output values were completely wrong. I found that these image files had huffman tables that included 16-bit codes. Because of that I found that the Huffman codec code stopped creating its decoder tree at the 15-bit level.

If you look at the while loop immediately above my change you can see that it is legal for "next" to equal LEAVES_OFFSET because it can still increment "next" when it equals (LEAVES_OFFSET-1) and it won't know it's done until i is >= leafCounter. All of that code is correct.

Furthermore, the root node of the decoder tree is purely a head (doesn't store any leaves). So, bit 1 is stored at level 1, bit 2 at level 2, etc. So in order to store a bit 16, next has to be allowed to go all the way to 16 but the old code limited it to 15.

The problem images I have are medical DICOM files that have been anonymized. I can provide one if absolutely necessary. However, you can see that the code change can't change the behavior for any files with (sane) Huffman tables whose largest codes have fewer than 16 bits. This change merely makes it work properly for more files.

sbesson commented 5 years ago

Closing/reopening to include b6b4eb05281b227515f9261b8717542f161f6dd1 in the Travis CI.

Thanks for your contribution @LoopinFool we will review it. Do you have sample files that help to reproduce the initial issue and are readable with this fix that could be shared for testing?

LoopinFool commented 5 years ago

As I said, I do have DICOM files that exhibit this issue. They load correctly in other dicom software. Can you make use of those files as-is or shall I use a binary editor to extract the JPEG code stream from inside the files?

sbesson commented 5 years ago

Our preference goes towards using the original files whenever possible. If the DICOM files are small enough and can be made public, you could attach them here. Other if the data needs to be kept private, you can upload them at http://qa.openmicroscopy.org.uk/qa/upload/ under 2GB. Let us know if this does not work or if we need to set up another type of transfer.

LoopinFool commented 5 years ago

While these images have been anonymized, it's still best to keep them semi-private. Thank you for having an upload space for that. I have uploaded the entire set of images as huffman-table-problem.zip

mtbc commented 5 years ago

The code change makes sense. Thank your for the images in QA 27761 which we acknowledge are not public samples. Recommend adding include label for further testing.