Open jonasteuwen opened 5 months ago
Yeah, I agree it would probably be a good idea for something like this to exist.
(Should an iSyntax-to-DICOM converter then be within the scope of libisyntax, or maybe a separate project?)
As for libraries, if I were to go about it I would probably use libjpeg-turbo for the JPEG encoding and try to use as few additional libraries as possible, like I did when adding (partial) support for reading DICOM WSIs in Slidescape (see the code in here). But there might be easier/better ways to do it.
It could also be simpler to have either openslide bindings or python bindings directly. Extending https://github.com/imi-bigpicture/wsidicomizer/tree/main/wsidicomizer would then make it very easy
Writing it without dependencies is likely painful: https://github.com/GoogleCloudPlatform/wsi-to-dicom-converter
Yes, I think extending an existing WSI to DICOM converter could be a good strategy. Then iSyntax would be just one more backend to support.
I am fairly confident it would be painful, only thinking about the number of hours I already spent reading the DICOM standard... ;-)
If openslide incorporates support for isyntax (through libisyntax), conversion to DICOM should be straightforward with what is already implemented in WsiDicomizer, as it already has openslide support.
That’s true, but that still seems to require quite a bit of work. It might be much easier at this point to make a python binding. It’s on my todo list.
I'm playing around with python bindings at this moment using cython, and have managed to read out tiles. Not sure how easy it is to package though.
I suppose you can use cmake or meson to compile the library during install. If we convert it to DICOM, it might be good to look into the XML header; there are a lot of DICOM tags there.
Probably it’s also a good idea to create the levels > 0 ourselves as there is an annoying offset between the levels.
Another option would be to perhaps make some GitHub actions to package it as a .deb, similar to openslide
I got a converter running using pyisyntax.
Will explore parsing metadata from the XML header. I have previously parsed some Philips XML metadata from isyntax files converted to tiff, and could get some of the attributes DICOM requires out of it.
This is pretty sweet, does your DICOM converter also support its own downsampling starting at level 0? That would solve #36 (offset between levels).
An additional idea would be to have a .get_offset(level) function so we can manually offset annotations as well.
For the DICOM metadata, probably you can just make a few simple adjustments to the libisyntax code to expose those. Something like GetDicomTag
We could add some accessors to expose the relevant fields. Or maybe allow invoking a callback procedure while parsing the XML header.
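To make the callback idea a bit more concrete, here is a rough Python sketch (not libisyntax code). It assumes the header is made of `Attribute` elements carrying `Name`, `Group` and `Element` attributes; that shape, the `SAMPLE_HEADER` fragment, and the names `walk_attributes` and `collect_tags` are all assumptions for illustration. The two tag names and values in the sample are taken from the testslide.isyntax dump in this thread.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a header fragment; a real iSyntax header is much larger
# and may be structured differently.
SAMPLE_HEADER = """
<DataObject ObjectType="UFSImageDimension">
  <Attribute Name="UFS_IMAGE_DIMENSION_NAME" Group="0x301D" Element="0x2004">x</Attribute>
  <Attribute Name="UFS_IMAGE_DIMENSION_SCALE_FACTOR" Group="0x301D" Element="0x2007">0.25</Attribute>
</DataObject>
"""


def walk_attributes(xml_text, on_attribute):
    """Call on_attribute(name, group, element, value) for every Attribute node."""
    root = ET.fromstring(xml_text)
    for node in root.iter("Attribute"):
        group = int(node.get("Group", "0"), 16)      # "0x301D" -> 0x301D
        element = int(node.get("Element", "0"), 16)
        value = (node.text or "").strip()
        on_attribute(node.get("Name"), group, element, value)


def collect_tags(xml_text):
    """Example callback user: gather every tag into a dict keyed by (group, element)."""
    tags = {}

    def remember(name, group, element, value):
        tags[(group, element)] = (name, value)

    walk_attributes(xml_text, remember)
    return tags
```

A converter could then pass its own callback and pick out only the tags it needs, without the parser having to know about DICOM at all.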
This is pretty sweet, does your DICOM converter also support its own downsampling starting at level 0? That would solve #36 (offset between levels).
It can re-create a full pyramid if levels are missing.
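For anyone wondering why rebuilding the levels from level 0 sidesteps the offset problem, here is a minimal pure-Python sketch using a 2x2 box average. This is not how the converter above does it; the function names and the plain nested-list grayscale format are assumptions chosen to keep the example self-contained.

```python
def downsample_2x(pixels):
    """Halve a 2D grid of grayscale values by averaging each 2x2 block.

    Odd trailing rows/columns are averaged over the pixels that exist.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = [
                pixels[yy][xx]
                for yy in range(y, min(y + 2, height))
                for xx in range(x, min(x + 2, width))
            ]
            row.append(sum(block) // len(block))  # integer mean of the block
        out.append(row)
    return out


def build_pyramid(level0, num_levels):
    """Return [level0, level1, ...], each level derived from the previous one."""
    levels = [level0]
    for _ in range(1, num_levels):
        levels.append(downsample_2x(levels[-1]))
    return levels
```

With this scheme, pixel (x, y) at level n always covers the level 0 pixels starting at (x * 2**n, y * 2**n), so coordinates map between levels by plain scaling with no extra offset.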
An additional idea would be to have a .get_offset(level) function so we can manually offset annotations as well.
With annotations, do you mean graphical annotations? In what dimensions are those (pixels or meters)?
For the DICOM metadata, probably you can just make a few simple adjustments to the libisyntax code to expose those. Something like GetDicomTag
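For converting to DICOM, there are some tags that are difficult to obtain and difficult for the user to supply, for example:
- Image orientation: How the pyramid levels are rotated with respect to the label, but this might be the same for all isyntax files.
- Total pixel matrix origin sequence: Where the first pixel in the pyramid is positioned in relation to the label corner (the reference).
Is such metadata available in the isyntax XML?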
Erik-
I never found the offset in the FIC (isyntax xml).
Interestingly, the lowest-level WSI image is never superimposed on the macro (jpeg) image in either the local PIFV viewer or in IMS, suggesting any alignment is imprecise, possibly scanner-specific.
Both viewers display small entire-slide-thumbnails, apparently simply the macro image concatenated with the label image.
I am guessing the line between those two jpeg images is considered the image column margin.
The slidescape team can probably confirm whether the WSI image matrix origin is the same as the "join" between label and macro images.
(I have glass slides I can measure to see how those (jpeg) images align with the true label edge.)
For converting to DICOM, there are some tags that are difficult to get and for the user to supply, for example
- Image orientation: How the pyramid levels are rotated with respect to the label, but this might be the same for all isyntax files.
- Total pixel matrix origin sequence: Where the first pixel in the pyramid is positioned in relation to the label corner (the reference).
Is such metadata available in the isyntax XML?
The image orientation is described in Philips' file format specification document (see the attached document for details, specifically chapter 2 and the appendix). 4522 207 43941_2020_04_24 Pathology iSyntax image format.pdf
The origin of the pyramid is specified under the UFS_IMAGE_GENERAL_HEADERS tag. Here is a dump of the relevant part of the XML (dumped using Slidescape) for testslide.isyntax:
DICOM: UFS_IMAGE_GENERAL_HEADERS (0x301d, 0x2000), array
  Array
    DataObject ObjectType = UFSImageGeneralHeader
      DICOM: UFS_IMAGE_NUMBER_OF_BLOCKS (0x301d, 0x2001), size:6 = 173421
      DICOM: UFS_IMAGE_DIMENSIONS_OVER_BLOCK (0x301d, 0x2002), size:9 = 1 0 4 2 3
      DICOM: UFS_IMAGE_DIMENSIONS (0x301d, 0x2003), array
        Array
          DataObject ObjectType = UFSImageDimension
            DICOM: UFS_IMAGE_DIMENSION_NAME (0x301d, 0x2004), size:1 = x
            DICOM: UFS_IMAGE_DIMENSION_TYPE (0x301d, 0x2005), size:7 = spatial
            DICOM: UFS_IMAGE_DIMENSION_UNIT (0x301d, 0x2006), size:10 = MicroMeter
            DICOM: UFS_IMAGE_DIMENSION_SCALE_FACTOR (0x301d, 0x2007), size:4 = 0.25
          element end: DataObject
          DataObject ObjectType = UFSImageDimension
            DICOM: UFS_IMAGE_DIMENSION_NAME (0x301d, 0x2004), size:1 = y
            DICOM: UFS_IMAGE_DIMENSION_TYPE (0x301d, 0x2005), size:7 = spatial
            DICOM: UFS_IMAGE_DIMENSION_UNIT (0x301d, 0x2006), size:10 = MicroMeter
            DICOM: UFS_IMAGE_DIMENSION_SCALE_FACTOR (0x301d, 0x2007), size:4 = 0.25
          element end: DataObject
          DataObject ObjectType = UFSImageDimension
            DICOM: UFS_IMAGE_DIMENSION_NAME (0x301d, 0x2004), size:9 = component
            DICOM: UFS_IMAGE_DIMENSION_TYPE (0x301d, 0x2005), size:16 = colour component
            DICOM: UFS_IMAGE_DIMENSION_DISCRETE_VALUES_STRING (0x301d, 0x2008), size:13 = "Y" "Co" "Cg"
          element end: DataObject
          DataObject ObjectType = UFSImageDimension
            DICOM: UFS_IMAGE_DIMENSION_NAME (0x301d, 0x2004), size:5 = scale
            DICOM: UFS_IMAGE_DIMENSION_TYPE (0x301d, 0x2005), size:5 = scale
          element end: DataObject
          DataObject ObjectType = UFSImageDimension
            DICOM: UFS_IMAGE_DIMENSION_NAME (0x301d, 0x2004), size:11 = waveletcoef
            DICOM: UFS_IMAGE_DIMENSION_TYPE (0x301d, 0x2005), size:11 = waveletcoef
            DICOM: UFS_IMAGE_DIMENSION_DISCRETE_VALUES_STRING (0x301d, 0x2008), size:19 = "LL" "LH" "HL" "HH"
          element end: DataObject
        element end: Array
      element end: Attribute
      DICOM: UFS_IMAGE_DIMENSION_RANGES (0x301d, 0x200a), array
        Array
          DataObject ObjectType = UFSImageDimensionRange
            DICOM: UFS_IMAGE_DIMENSION_RANGE (0x301d, 0x200b), size:13 = 13531 1 52442
          element end: DataObject
          DataObject ObjectType = UFSImageDimensionRange
            DICOM: UFS_IMAGE_DIMENSION_RANGE (0x301d, 0x200b), size:13 = 22053 1 96804
          element end: DataObject
          DataObject ObjectType = UFSImageDimensionRange
            DICOM: UFS_IMAGE_DIMENSION_RANGE (0x301d, 0x200b), size:5 = 0 1 2
          element end: DataObject
          DataObject ObjectType = UFSImageDimensionRange
            DICOM: UFS_IMAGE_DIMENSION_RANGE (0x301d, 0x200b), size:5 = 0 1 7
          element end: DataObject
          DataObject ObjectType = UFSImageDimensionRange
            DICOM: UFS_IMAGE_DIMENSION_RANGE (0x301d, 0x200b), size:5 = 0 1 3
          element end: DataObject
        element end: Array
      element end: Attribute
    element end: DataObject
  element end: Array
element end: Attribute
In the example of testslide.isyntax, the (padded) level 0 pyramid starts at (13531, 22053) and has a padded width/height of (38912, 74752).
In libisyntax, the relevant part of the XML is parsed here: https://github.com/amspath/libisyntax/blob/b9f0cd980a93d07602caa3536c7fef9245ae2fd9/src/isyntax/isyntax.c#L520-L540
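The numbers quoted for testslide.isyntax can be reproduced from the UFS_IMAGE_DIMENSION_RANGE entries in the dump. The sketch below assumes each range string reads as "start step end" with both ends inclusive (this matches the quoted width/height, but it is an inference from the numbers, not something taken from the specification), and that the 0.25 MicroMeter scale factor also applies to the origin offset. The helper names are made up for this example.

```python
def parse_range(text):
    """Parse a 'start step end' range string into three ints."""
    start, step, end = (int(part) for part in text.split())
    return start, step, end


def range_length(text):
    """Number of samples in a range, assuming both ends are inclusive."""
    start, step, end = parse_range(text)
    return (end - start) // step + 1


# Values copied from the dump for testslide.isyntax.
X_RANGE = "13531 1 52442"
Y_RANGE = "22053 1 96804"
MICRONS_PER_PIXEL = 0.25  # UFS_IMAGE_DIMENSION_SCALE_FACTOR for x and y

origin_px = (parse_range(X_RANGE)[0], parse_range(Y_RANGE)[0])
size_px = (range_length(X_RANGE), range_length(Y_RANGE))
# Assumption: the same scale factor converts the pixel offset to micrometers.
origin_um = tuple(v * MICRONS_PER_PIXEL for v in origin_px)

print(origin_px)  # (13531, 22053)
print(size_px)    # (38912, 74752)
print(origin_um)  # (3382.75, 5513.25)
```

If that reading is right, the origin in micrometers is the kind of value a converter would need when filling in the total pixel matrix origin, after accounting for whatever reference point DICOM expects.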
The macro image has its origin at (0, 0) in the coordinate system used by the iSyntax files. The offset of the WSI pyramid can be read from the XML header as described above. The label image is rotated 90 degrees compared to the macro image so that the text on the label is right-side up (see the specification document for details).
The FIC files produced by the PIFV viewer are not very useful, I think; the information in there is incomplete.
BigPicture uses DICOM: https://bigpicture.eu/news/bigpicture-raises-dicom-standards and it is likely to become an industry standard.
I can write a converter if you want, but it might be tricky to know which libraries to import.
Shall I draft something?