imi-bigpicture / wsidicom

Python package for reading DICOM WSI file sets.
Apache License 2.0

Adding support for more transfer syntaxes #119

Closed psavery closed 7 months ago

psavery commented 8 months ago

We are currently performing some tests with a DICOMweb server that has some examples in Explicit VR Little Endian and JPEG-LS format. It would be nice if we could support as many formats as possible in wsidicom.

If we have a transfer syntax that Pillow doesn't support, what do you think about relying on pydicom's pixel_data_handlers to first convert the bytes to a numpy array, and then convert it to Pillow via Image.fromarray()?
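A minimal sketch of that fallback, assuming numpy and Pillow are installed. The function name `decode_frame_to_image`, the `array_decoder` callable, and the set of transfer syntaxes treated as Pillow-decodable are all hypothetical choices for illustration; in practice `array_decoder` would wrap pydicom's pixel data handlers, which is simulated here with a raw RGB stand-in.

```python
from io import BytesIO
from typing import Callable

import numpy as np
from PIL import Image

# Assumption: transfer syntaxes that Pillow is expected to decode directly.
PILLOW_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2.4.50",  # JPEG Baseline (Process 1)
    "1.2.840.10008.1.2.4.90",  # JPEG 2000 (lossless only)
    "1.2.840.10008.1.2.4.91",  # JPEG 2000
}


def decode_frame_to_image(
    frame: bytes,
    transfer_syntax_uid: str,
    array_decoder: Callable[[bytes], np.ndarray],
) -> Image.Image:
    """Use Pillow when it supports the syntax, else go via a numpy array."""
    if transfer_syntax_uid in PILLOW_TRANSFER_SYNTAXES:
        return Image.open(BytesIO(frame))
    # Fallback: decode to a numpy array first, then hand it to Pillow.
    return Image.fromarray(array_decoder(frame))


def raw_rgb_decoder(frame: bytes) -> np.ndarray:
    """Stand-in for a pydicom-based decoder: 2x2 uncompressed RGB pixels."""
    return np.frombuffer(frame, dtype=np.uint8).reshape(2, 2, 3)


# JPEG-LS Lossless is not in the Pillow set, so the fallback path is taken.
fallback_image = decode_frame_to_image(
    bytes(range(12)), "1.2.840.10008.1.2.4.80", raw_rgb_decoder
)
```

Dispatching on the transfer syntax UID rather than letting Pillow sniff the bytes avoids misidentifying raw pixel data as some image format.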

Those pixel_data_handlers currently do not support decoding individual frames. However, there is a PR up for such support, so it will hopefully be supported in the future. In the meantime, highdicom includes a decode_frame() function that takes as arguments some of the attributes on the DICOM dataset. It then creates a "fake" DICOM dataset and utilizes pydicom's pixel_data_handlers to convert the frame to a numpy array. highdicom is planning to deprecate and remove this function when pydicom starts supporting frame decoding. We could potentially add highdicom as an optional dependency for this functionality until pydicom supports it.

What do you think?

erikogabrielsson commented 8 months ago

Using pydicom's pixel_data_handlers as a fallback for transfer syntaxes that Pillow does not support sounds like a good idea. I made a quick test here. I don't have any examples of WSIs with a non-Pillow-supported transfer syntax, but I guess I can create some to test with.

erikogabrielsson commented 8 months ago

Another issue related to this is that all image processing is currently done in Pillow. This works OK for 8-bit color images and for grayscale images of higher bit depth. Looking at the DICOM standard:

- JPEG Baseline is restricted to 8 bits. OK with Pillow.

I don't think >8-bit color images are common for WSI, so maybe it is not a problem to not support them. Alternatively, we need to change the image processing (cropping, stitching, scaling) to be done in, for example, numpy.
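As a rough illustration of what numpy-based stitching and cropping could look like, here is a small sketch that keeps 16-bit color data intact. The helper names `stitch_tiles` and `crop` are hypothetical and not part of wsidicom; scaling is left out because it needs an additional resampling library.

```python
import numpy as np


def stitch_tiles(tiles: list[list[np.ndarray]]) -> np.ndarray:
    """Stitch a row-major grid of equally sized tiles into one array."""
    rows = [np.concatenate(row, axis=1) for row in tiles]  # join left to right
    return np.concatenate(rows, axis=0)  # then join top to bottom


def crop(image: np.ndarray, x: int, y: int, width: int, height: int) -> np.ndarray:
    """Return the region with top-left corner (x, y) and the given size."""
    return image[y : y + height, x : x + width]


# Four 2x2 16-bit RGB tiles, each filled with a distinct value.
tiles = [
    [np.full((2, 2, 3), 1, dtype=np.uint16), np.full((2, 2, 3), 2, dtype=np.uint16)],
    [np.full((2, 2, 3), 3, dtype=np.uint16), np.full((2, 2, 3), 4, dtype=np.uint16)],
]
stitched = stitch_tiles(tiles)  # shape (4, 4, 3)
region = crop(stitched, x=1, y=1, width=2, height=2)  # spans all four tiles
```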

psavery commented 8 months ago

> Using pydicom's pixel_data_handlers as a fallback for transfer syntaxes that Pillow does not support sounds like a good idea. I made a quick test here. I don't have any examples of WSIs with a non-Pillow-supported transfer syntax, but I guess I can create some to test with.

Nice test code!

You can try the public slim examples here.

To access them, use the following arguments for the DICOMwebClient:

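from dicomweb_client.api import DICOMwebClient

# Public SLIM example server; the QIDO and WADO prefixes are left unset.
client = DICOMwebClient(
    'https://idc-external-006.uc.r.appspot.com/dcm4chee-arc/aets/DCM4CHEE/rs',
    qido_url_prefix=None,
    wado_url_prefix=None,
)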

Most of those examples have their pixel data uncompressed (ExplicitVRLittleEndian). If you call Image.open() on these, however, Pillow won't be able to identify the format. pydicom's numpy_handler looks at several DICOM attributes to convert this uncompressed data into a numpy array. This might mean it's good for us to rely on pydicom's pixel data handlers even for uncompressed data.
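To make concrete which attributes matter for uncompressed data, here is a simplified sketch using only numpy. It is not pydicom's implementation: the function name `uncompressed_frame_to_array` is hypothetical, it assumes little-endian unsigned samples with 8 or 16 bits allocated, and it ignores attributes such as PixelRepresentation and PhotometricInterpretation that a real handler would also consider.

```python
import numpy as np


def uncompressed_frame_to_array(
    frame: bytes,
    rows: int,
    columns: int,
    samples_per_pixel: int,
    bits_allocated: int,
    planar_configuration: int = 0,
) -> np.ndarray:
    """Interpret one uncompressed little-endian frame as a numpy array."""
    if bits_allocated == 8:
        dtype = np.dtype("uint8")
    elif bits_allocated == 16:
        dtype = np.dtype("<u2")  # little-endian unsigned 16-bit
    else:
        raise ValueError(f"Unsupported BitsAllocated: {bits_allocated}")
    count = rows * columns * samples_per_pixel
    # Passing 'count' ignores any trailing padding after the pixel values.
    pixels = np.frombuffer(frame, dtype=dtype, count=count)
    if samples_per_pixel == 1:
        return pixels.reshape(rows, columns)
    if planar_configuration == 1:
        # Stored plane by plane (RRR...GGG...BBB...); move samples last.
        return pixels.reshape(samples_per_pixel, rows, columns).transpose(1, 2, 0)
    return pixels.reshape(rows, columns, samples_per_pixel)


raw = bytes(range(12))  # 2x2 RGB, 8 bits per sample
interleaved = uncompressed_frame_to_array(raw, 2, 2, 3, 8)
planar = uncompressed_frame_to_array(raw, 2, 2, 3, 8, planar_configuration=1)
```

The same bytes yield different pixels depending on PlanarConfiguration, which is one reason Pillow cannot identify such data on its own.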

Those slim examples also have examples with different JPEG variants, including one that is JPEG-LS.

There are also some extensions to Pillow that might include handlers for things like JPEG-LS (for example here). However, I don't know how well supported they are (that JPEG-LS example hasn't been updated in over a year). I think it's nice to just rely on pydicom to figure out these conversions for us, and stay up-to-date with dependencies!

psavery commented 8 months ago

> Another issue related to this is that all image processing is currently done in Pillow. This works OK for 8-bit color images and for grayscale images of higher bit depth.
>
> I don't think >8-bit color images are common for WSI, so maybe it is not a problem to not support them. Alternatively, we need to change the image processing (cropping, stitching, scaling) to be done in, for example, numpy.

If we switch it to use and return numpy arrays in wsidicom, that will work fine with us for large_image! We can handle numpy arrays also.

erikogabrielsson commented 7 months ago

> > Another issue related to this is that all image processing is currently done in Pillow. This works OK for 8-bit color images and for grayscale images of higher bit depth.
> >
> > I don't think >8-bit color images are common for WSI, so maybe it is not a problem to not support them. Alternatively, we need to change the image processing (cropping, stitching, scaling) to be done in, for example, numpy.
>
> If we switch it to use and return numpy arrays in wsidicom, that will work fine with us for large_image! We can handle numpy arrays also.

I will have to check what good alternatives there are for handling numpy image data, especially when it comes to scaling.

erikogabrielsson commented 7 months ago

Fixed by #126

psavery commented 7 months ago

@erikogabrielsson By the way, it looks like pydicom's RLE decoder only requires numpy. Did you consider using that instead of pylibjpeg-rle? The only problem with pylibjpeg-rle is that it doesn't have python3.11 wheels, so we can't use it when we have python3.11. It looks like pylibjpeg-rle hasn't been actively supported (the last update was over a year ago).

erikogabrielsson commented 7 months ago

Yes, one can decode RLE with only pydicom, but according to the pylibjpeg-rle author it is significantly slower. I implemented a decoder using imagecodecs, but imagecodecs does not yet support padded bytes as used in DICOM RLE.
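For context on the padding issue, here is a pure-Python sketch of DICOM RLE decoding for 8-bit samples. It is not the pydicom, pylibjpeg-rle, or imagecodecs implementation, and the function names are hypothetical. A DICOM RLE frame starts with a 64-byte header (segment count plus 15 offsets), followed by PackBits-encoded segments that are each padded to an even length, so the decoder stops once the expected number of bytes has been produced instead of interpreting the padding.

```python
import struct


def decode_packbits(data: bytes, expected_length: int) -> bytes:
    """Decode one PackBits segment, ignoring any trailing padding byte."""
    out = bytearray()
    pos = 0
    while len(out) < expected_length and pos < len(data):
        header = data[pos]
        pos += 1
        if header < 128:
            count = header + 1  # literal run: copy the next 'count' bytes
            out += data[pos : pos + count]
            pos += count
        elif header > 128:
            count = 257 - header  # replicate run: repeat the next byte
            out += data[pos : pos + 1] * count
            pos += 1
        # header == 128 is a no-op
    return bytes(out[:expected_length])


def decode_rle_frame(frame: bytes, rows: int, columns: int) -> bytes:
    """Decode an 8-bit DICOM RLE frame into pixel-interleaved bytes."""
    header = struct.unpack("<16L", frame[:64])
    offsets = list(header[1 : 1 + header[0]]) + [len(frame)]
    planes = [
        decode_packbits(frame[start:end], rows * columns)
        for start, end in zip(offsets, offsets[1:])
    ]
    # One segment per sample plane; interleave them pixel by pixel.
    return bytes(value for pixel in zip(*planes) for value in pixel)


# A hand-built 2x2 RGB frame; the G and B segments end with a padding byte.
segments = b"\xfd\x0a" + b"\x03\x01\x02\x03\x04\x00" + b"\xff\x07\x01\x08\x09\x00"
rle_frame = struct.pack("<16L", 3, 64, 66, 72, *([0] * 12)) + segments
decoded = decode_rle_frame(rle_frame, rows=2, columns=2)
```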

But pylibjpeg-rle should be an optional extra? Are you unable to install wsidicom on >=3.11 even without the extra?

psavery commented 7 months ago

We can install wsidicom for python versions 3.9 - 3.12, but we would only be able to include pylibjpeg-rle for python versions 3.9 and 3.10. So users of our program that use Python 3.11 and 3.12 won't be able to decode RLE (unless it falls back to using the pydicom decoder for those versions).
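One way such a per-environment fallback could be expressed is to probe for optional packages at runtime and pick the first one that is installed. This is only a sketch using the standard library; `first_available` is a hypothetical helper, and the module names in the candidate list are assumptions (in particular, that pylibjpeg-rle is imported as `rle`).

```python
from importlib.util import find_spec
from typing import Optional, Sequence


def first_available(candidates: Sequence[str]) -> Optional[str]:
    """Return the first top-level module name that can be imported."""
    for module_name in candidates:
        if find_spec(module_name) is not None:
            return module_name
    return None


# Preferred decoder backends in priority order (assumed module names).
rle_backend = first_available(["rle", "imagecodecs", "pydicom"])
```

If none of the candidates is installed, `rle_backend` is `None` and the caller can report that RLE decoding is unavailable.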

erikogabrielsson commented 7 months ago

You should be able to read RLE images using the PydicomDecoder and the ImageCodecsRleEncoder (we need to be able to also encode the format if scaled or cropped encoded tiles are requested). ImageCodecs should support Python 3.11 and 3.12.