ImagingDataCommons / libdicom

C library for reading DICOM files
https://libdicom.readthedocs.io
MIT License
14 stars 7 forks source link

Pyramid built incorrectly for some Philips scans #79

Open jcupitt opened 7 months ago

jcupitt commented 7 months ago

dclunie wrote:

Here are some more samples from 3DHISTECH, Hamamatsu, Leica, Objective Imaging, Philips and Roche that were used in the DICOM WG 26 2023 ECDP Connectathon, which you might want to add to your test set:

ftp://medical.nema.org//MEDICAL/Dicom/DataSets/WG26/WG26Connectathon2023_ECDP

These are CC0 license (public domain, no restrictions).

David

PS. There is a problem with the single Philips scan - it seems to jump around as one zooms in and out, so its pyramid does not seem to be encoded correctly.

This is probably an issue in the openslide vendor driver, but we should check libdicom as well.

dclunie commented 7 months ago

Actually I am not sure that there is any problem with openslide or libdicom in this respect - I think the Philips data is "bad". BioFormats and Slim show the same problem.

jcupitt commented 3 months ago

Thanks for the samples, @dclunie. I'll have a look at adding them to the test suite.

The philips image actually works in the latest vipsdisp, though it's extremely slow. openslide is detecting the levels as:

openslide.level[0].downsample: 1
openslide.level[1].downsample: 1.945054945054945
openslide.level[2].downsample: 3.8901098901098901
openslide.level[3].downsample: 7.0178571428571423
openslide.level[4].downsample: 13.125
openslide.level[5].downsample: 26.25
openslide.level[6].downsample: 39

Which is far enough away from powers of two that vipsdisp treats the file as a single level (!!) and only uses level[0].

They seem to be always writing whole 1024 x 1024 tiles and using PixelSpacing to indicate level reduction.

openslide is expecting fractional tiles and using the way TotalPixelMatrixColumns changes to figure out where each level belongs.

Does anyone have a view on the correct behaviour here? I'm not sure if PixelSpacing is a required field, so perhaps we need to use that if present and otherwise fall back to TotalPixelMatrixColumns. Does that sound reasonable?

We might also need to crop levels in openslide to force fractional tiles. I've asked the openslide devs for an opinion on that.

jcupitt commented 3 months ago

As an aside, the philips PixelSpacing tags follow a nice ^2 order, so using that field should work.

Filename PixelSpacing
SM4.dcm 0.00025
SM5.dcm 0.0005
SM3.dcm 0.001
SM6.dcm 0.002
SM2.dcm 0.004
SM0.dcm 0.008
SM1.dcm 0.016
bgilbert commented 3 months ago

TIFFs exported by the Philips software have long had this same behavior: edge tiles in downsampled levels are padded to a tile boundary, and the metadata claims the padding is a valid part of the image. OpenSlide works around it exactly as John suggested, using the ratios between levels' pixel spacings to decide how to crop the downsampled levels. We could do the same here.

@dclunie, do you know whether this was just a random bug caught at a connectathon, or whether it'll show up in Philips DICOMs in the wild? In the latter case we'd probably want to add a workaround to OpenSlide.

jcupitt commented 3 months ago

I tried the other DICOMs in the WG26Connectathon2023_ECDP set and they all worked well, except for 3dhistech/Case-A.

This fails using the venerable dcmdump with errors like:

$ dcmdump 6_0 
W: DcmSequenceOfItems: Length of item in sequence PixelData (7fe0,0010) is odd
E: DcmSequenceOfItems: Parse error in sequence (7fe0,0010), found (00ff,40e0) instead of sequence delimiter (fffe,e0dd)
E: dcmdump: Sequence Delimitation Item missing: reading file: 6_0

And

$ dcmdump 2
E: dcmdump: I/O suspension or premature end of stream: reading file: 2

libdicom also fails for this file, but I think if dcmdump fails it's probably bad data and we don't need to worry.