ImagingDataCommons / highdicom

High-level DICOM abstractions for the Python programming language
https://highdicom.readthedocs.io
MIT License
168 stars 35 forks source link

Convert color space after frame decoding #152

Closed hackermd closed 2 years ago

hackermd commented 2 years ago

This PR introduces changes for color management during decoding of JPEG compressed frames. Currently the pydicom library does not automatically transform image frames into RGB color space. I consider this a bug in pydicom, but it's unclear whether it will be fixed in pydicom or whether we will have to work around the issue in highdicom.

Please refer to the discussion on the following pydicom PR: https://github.com/pydicom/pydicom/pull/1570#issuecomment-1040330739

hackermd commented 2 years ago

I marked this PR as a draft, because I thought we may want to wait until we have further clarify from the pydicom maintainers on what changes may be made to pydicom and what changes may be necessary in highdicom. However, the decode_frame() function currently returns wrong values (or at least highly unexpected values) and we could consider merging this PR now and creating another PR in the future if the behavior will change in a future version of pydicom. @CPBridge thoughts?

hackermd commented 2 years ago

Not converting the colour space is consistent with pydicom's philosophy of not altering the data it returns, much like it does not apply modality LUT transformations (rescale/intercept) in, e.g., CT and I prefer that they have a consistent philosophy on the matter even if it might not be the one I would choose. From my reading of that thread, colour space conversion will be added in the new pixel data engine, presumably optionally.

My point is that it alters the data. If one just decodes the JPEG bitstream, one generally gets an image in RGB color space (because the codec assumes that the image has been converted from RGB to YBR upon encoding). The pydicom library currently converts the data to YBR after decoding.

Anyway, since returning YBR data is in my opinion valid, I really don't think that our API should force users to convert the colour space to RGB, so I would suggest adding a boolean parameter to control this behaviour. It would probably make sense to have it True (meaning returned data will be RGB regardless of the stored photometric interpretation) by default but allow users to disable this if they like and they know what they are doing.

I am not sure it's valid. As far as I understand it, if the image is compressed, Photometric Interpretation describes the interpretation of the compressed bitstream. Since JPEG codec generally convert images from RGB to YBR upon compression, the value of Photometric Interpretation is "YBR_FULL_422" (super confusing if you ask me). However, the data usually is acquired in RGB color space and that's also what the device ICC profile says. When one would apply the ICC profile to the image in YBR color space, one would get an incorrect color. Therefore, I tend argue that the data is not valid.

PS3.5 8.2.1 JPEG Image Compression:

If JPEG Compressed Pixel Data is decompressed and re-encoded in Native (uncompressed) form, then the Data Elements that are related to the Pixel Data encoding are updated accordingly. If color components are converted from YBR_FULL_422 to RGB during decompression and Native re-encoding, the Photometric Interpretation will be changed to RGB in the Data Set with the Native encoding.

We could also consider whether we would want to add some of the other transformations that I consider equally as basic, such as modality LUT, to this function also. What do you think?

I think the Modality LUT is conceptually different from decompressing encapsulated pixel data frame items. However, we should provide an API to apply those transformations as well (maybe similar to the ColorManager?)

hackermd commented 2 years ago

@CPBridge one more thing regarding JPEG. The biggest issue at the moment is in my opinion that pydicom is inconsistent. It converts JPEG encoded images to YBR based on the value of Photometric Intpretation, but does not convert the color space for JPEG 2000 or JPEG-LS encoded images.

CPBridge commented 2 years ago

Thanks for the explanation, I think you've largely convinced me. By the way I'm not sure this all came across so clearly to me on the pydicom thread (1570) when I read it.

I think it makes sense to merge now if it will be useful to you.

hackermd commented 2 years ago

Thanks for the explanation, I think you've largely convinced me. By the way I'm not sure this all came across so clearly to me on the pydicom thread (1570) when I read it.

I think it makes sense to merge now if it will be useful to you.

I added a detailed comment to the source code and restricted the color space conversion to JPEG baseline 8-bit transfer syntax. We may need to revisit this depending on what the pydicom maintainers decide to do.