ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

bug: bfconvert generates bad DICOMs from SVS files #4161

Open jcupitt opened 6 months ago

jcupitt commented 6 months ago

Hello everyone, thank you for this nice thing.

While working on openslide, I've come across a bfconvert bug when generating DICOM files.

DICOM has a PhotometricInterpretation tag that indicates whether tiles are in the RGB or YCbCr colourspace. According to the DICOM spec, this tag is where the tile colourspace is recorded, and NOT the JPEG tiles themselves. On decode, you need to open each tile and force the tile colourspace from the DICOM header.
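As a rough illustration of that decode rule, the sketch below picks the colourspace to force on the JPEG decoder from the DICOM header value alone. The helper name `decoder_colorspace` and the returned labels are hypothetical, not part of bfconvert, libdicom or libjpeg; the PhotometricInterpretation strings are standard DICOM defined terms, and only the ones relevant to JPEG tiles are handled.

```python
# Hypothetical helper: choose the JPEG decoder colourspace from the DICOM
# PhotometricInterpretation value, ignoring whatever the JPEG stream claims.
_PI_TO_DECODER = {
    "RGB": "RGB",
    "YBR_FULL": "YCbCr",
    "YBR_FULL_422": "YCbCr",
}


def decoder_colorspace(photometric_interpretation: str) -> str:
    """Return the colourspace label to force on the decoder for each tile."""
    key = photometric_interpretation.strip().upper()
    try:
        return _PI_TO_DECODER[key]
    except KeyError:
        # Anything else (monochrome, palette, ...) is out of scope for this sketch.
        raise ValueError(f"unhandled PhotometricInterpretation: {key!r}")


if __name__ == "__main__":
    print(decoder_colorspace("YBR_FULL_422"))
```

In a real reader, the returned label would be translated into whatever the JPEG library uses to override its own colourspace guess during decompressor setup.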

If you use -precompressed, conversion will copy over the JPEG tiles untouched, so if the DICOM header and the tile colorspace were correct beforehand, they will still match in the converted image.

If you convert without the -precompressed flag, bfconvert will re-encode the JPEGs and may well change the photometric interpretation. For example, SVS is saved as RGB (no chroma subsampling), but bfconvert will save as YCbCr (with chroma subsampling). The DICOM photometric interpretation will then still say RGB, but the tiles will be YCbCr, so users will see crazy colors.
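To see why the mismatch produces wrong colours, here is a minimal sketch that converts a pixel to YCbCr with the full-range BT.601 formula used by JFIF, then displays the stored channels as if they were RGB, which is what a reader does when the header says RGB. The function names are made up for this illustration, and chroma subsampling is ignored.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr, as used by JFIF JPEG encoders."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each channel to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def misread_as_rgb(r, g, b):
    """What a viewer shows if the tile holds YCbCr but the header says RGB."""
    # The decoder is told "RGB", so it skips the YCbCr -> RGB conversion and
    # shows the three stored channels directly.
    return rgb_to_ycbcr(r, g, b)


if __name__ == "__main__":
    for name, pixel in [("red", (255, 0, 0)), ("mid-grey", (128, 128, 128))]:
        print(name, pixel, "->", misread_as_rgb(*pixel))
```

Saturated colours shift badly, while mid-grey (128, 128, 128) happens to map to itself, so the damage is most obvious in stained tissue rather than in neutral background.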

tl;dr: when saving DICOM, if tiles are being recompressed, bfconvert needs to update the DICOM PhotometricInterpretation tag.

Referring openslide issue: https://github.com/openslide/openslide/pull/558

Referring libdicom issue: https://github.com/ImagingDataCommons/libdicom/issues/80

bgilbert commented 6 months ago

Also, even with -precompressed, bfconvert re-encodes the label and overview images to YCbCr but leaves their PhotometricInterpretation values as RGB.

melissalinkert commented 6 months ago

Thanks for reporting this, @jcupitt / @bgilbert. Just to make sure I understand before implementing a fix, is the following what you would expect to be sufficient:

/cc @dclunie, @fedorov

jcupitt commented 6 months ago

Hi @melissalinkert, thanks for working on this.

I think you can assume the input file is correct, so if you just copy over the JPEG images, you don't need to do anything.

If you decompress or compress, you need to look at, and perhaps set, the DICOM PhotometricInterpretation tag.

  1. On decompress, you need to set the libjpeg input colorspace from the DICOM metadata during decompressor setup (don't rely on libjpeg to get this right, it'll miss with e.g. SVS).
  2. Conversely, on compress, you need to set the DICOM tag to the colourspace you compressed the JPEGs to.
  3. And as Benjamin says, this also applies to the thumbnail / macro / label etc.
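The rules above can be sketched as a small decision function. This is only an illustration under stated assumptions, not bfconvert's actual code: the name `output_photometric`, its parameters, and the choice of `YBR_FULL_422` for re-encoded YCbCr tiles are all assumptions made for the example (the correct defined term for a given encoder setup is governed by the DICOM standard).

```python
# Assumed mapping from the colourspace the encoder actually wrote to the
# DICOM PhotometricInterpretation value to store. Simplified for the sketch.
_ENCODED_TO_PI = {
    "RGB": "RGB",
    "YCbCr": "YBR_FULL_422",
}


def output_photometric(input_pi, recompressed, encoder_colorspace=None):
    """Pick the PhotometricInterpretation to write for one image."""
    if not recompressed:
        # Tiles copied untouched: the input header is assumed correct.
        return input_pi
    if encoder_colorspace not in _ENCODED_TO_PI:
        raise ValueError(f"unknown encoder colourspace: {encoder_colorspace!r}")
    # Tiles were re-encoded: the tag must follow the encoder, not the input.
    return _ENCODED_TO_PI[encoder_colorspace]


if __name__ == "__main__":
    # The same rule applies to every image in the file, not only the pyramid.
    images = {
        "pyramid": ("RGB", False, None),    # copied, as with -precompressed
        "label": ("RGB", True, "YCbCr"),    # re-encoded
        "overview": ("RGB", True, "YCbCr"), # re-encoded
    }
    for name, args in images.items():
        print(name, "->", output_photometric(*args))
```

Keeping the decision per image is what covers the label and overview case: they can be re-encoded even when the pyramid tiles are copied untouched.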