Open thewtex opened 3 months ago
@jadh4v can you please take a look? We are getting slight differences with the origin
and direction
signs between the CT and SEG. @fedorov does this ring any bells?
Rendered preview: https://app.reviewnb.com/InsightSoftwareConsortium/itkwidgets/pull/757/
To make this easier to review, if I understand correctly from https://app.reviewnb.com/InsightSoftwareConsortium/itkwidgets/pull/757/, this is SEG:
origin=[-177.300003, 168.339722, 3.80999756],
spacing=[0.660156, 0.660156, 2.5],
direction=array([[ 1., 0., 0.],
[ 0., -1., 0.],
[ 0., 0., -1.]]),
size=[512, 512, 109],
and this is CT:
Size: [512, 512, 110]
Spacing: [0.660156, 0.660156, 2.5]
Origin: [-177.3, -169, -268.69]
Direction:
1 0 0
0 1 0
0 0 1
I think there are multiple issues that are/may be involved.
My recommendation has been to resample SEG to the image geometry with nearest neighbor interpolator before working with the arrays. It is not pretty, but should be lossless. I am not sure there is a better way.
@pieper do you have any comments?
@thewtex related to this PR, in
!{sys.executable} -m pip install -q pooch itk-io "itkwidgets[all]>=1.0a49" pydicom rich s5cmd itkwasm-dicom numpy
I recommend you instead of s5cmd
install idc-index
. idc-index
will bring s5cmd
as a dependency, but will also provide additional functions to simplify interaction with IDC.
For example, the cell below:
%%capture
# CT series downloaded from IDC, NLST collection (https://doi.org/10.7937/TCIA.HMQ8-J677)
!s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com cp "s3://public-datasets-idc/865da32b-cea7-4e6d-b4fe-9731bf08b3d2/*" CT_DICOM_series
# segmentation of this series downloaded from IDC, from nnU-Net-BPR-Annotations (https://doi.org/10.5281/zenodo.7539035)
!s5cmd --no-sign-request --endpoint-url https://storage.googleapis.com cp "s3://public-datasets-idc/3a0c03fe-3d17-48dc-9b8d-f64f5302e0a4/*" SEG_DICOM_series
will become the following:
# CT series downloaded from IDC, NLST collection (https://doi.org/10.7937/TCIA.HMQ8-J677)
!idc download 1.2.840.113654.2.55.247854884634057477137769379611681725965
# segmentation of this series downloaded from IDC, from nnU-Net-BPR-Annotations (https://doi.org/10.5281/zenodo.7539035)
!idc download 1.2.276.0.7230010.3.1.3.481034752.249602.1661572065.734219
In the above, 865da32b-cea7-4e6d-b4fe-9731bf08b3d2
is IDC crdc_series_uuid
(not a DICOM attribute, DICOM does not support object versioning natively), and 1.2.840.113654.2.55.247854884634057477137769379611681725965
is DICOM SeriesInstanceUID
.
crdc_series_uuid
uniquely identifies the binary files associated with a series, meaning that if the specific DICOM series changes in the future but will keep the same SeriesInstanceUID
, the files corresponding to the new version of the series will get assigned a new crdc_series_uuid
.
idc download
command will pull requested DICOM series defined by SeriesInstanceUID
(it can also download content referenced by PatientID
, StudyInstanceUID
, or collection_id
, and will sort the files into hierarchy on download) from the current version of IDC.
We definitely can add the functionality to download by crdc_series_uuid
, just didn't have a use case for that. I created an issue in https://github.com/ImagingDataCommons/idc-index/issues/117.
In any case, I strongly recommend idc-index
to simplify the code and open the door for other opportunities to work with IDC for the notebook users!
@thewtex My first impression is that the coordinate system chosen for SEG is different. I suppose it depends on how the SEG was generated. Also agree with @fedorov regarding precision loss as well as skipping of slices (accounts for dim difference). Looks like we need a baseline understanding of what to expect from a pair of DICOM + SEG dataset. One way to ensure that is to generate our own data but we'd we living in our own bubble reading+writing our own test data.
Would there be a way to interpret the SEG from any third party software (even if closed source), get it converted and then compare the outputs?
@pieper do you have any comments?
Lots of good things to discuss here - a main point being that the SEG and CT are essentially two independent entities with their own spacings, dimensions, orientations, etc that are only shown together when they share a frame of reference UID (or there's a spatial registration to map between frames of reference. So we shouldn't, for example, consider the CT geometry when creating the SEG object, but rather create the most faithful representation of the input data.
the SEG and CT are essentially two independent entities with their own spacings, dimensions, orientations, etc that are only shown together when they share a frame of reference UID (or there's a spatial registration to map between frames of reference
I agree with Steve, this is a perfect summary! Also, since I wrote the above, I realized that original segmentations were created in NIfTI format using nnU-Net, which were then converted into DICOM SEG using dcmqi. Change in image orientation most likely happened during NIfTI conversion.
A separate question is why itkWidgets don't seem to be able to correctly render segmentation label that has a different orientation compared to the image? Is this a bug?
I agree with Steve, this is a perfect summary! Also, since I wrote the above, I realized that original segmentations were created in NIfTI format using nnU-Net, which were then converted into DICOM SEG using dcmqi. Change in image orientation most likely happened during NIfTI conversion.
Yes, this does seem to be the source of the problem. What would you like to do? A few ideas:
Lots of good things to discuss here - a main point being that the SEG and CT are essentially two independent entities with their own spacings, dimensions, orientations, etc that are only shown together when they share a frame of reference UID (or there's a spatial registration to map between frames of reference. So we shouldn't, for example, consider the CT geometry when creating the SEG object, but rather create the most faithful representation of the input data.
Yes, we are trying to use the metadata of the CT and SEG as encoded individually in each. But that input metadata appears to have issues.
My recommendation has been to resample SEG to the image geometry with nearest neighbor interpolator before working with the arrays.
This is already the case. In fact, we even use an interpolator with better than nearest-neighbor interpolation for labels, available in a small, universal ITK-Wasm python package :-).
If we apply the CT metadata to the SEG, this is what we get:
Note that it is also incorrect -- the heart and esophagus are inverted.
This is what the current metadata is specifying:
The empty box is the domain of the SEG -- vtk.js has a limitation of only rendering one volume.
I recommend you instead of s5cmd install idc-index. idc-index will bring s5cmd as a dependency, but will also provide additional functions to simplify interaction with IDC.
Most excellent!! :sparkler: Done. :champagne:
Yes, we are trying to use the metadata of the CT and SEG as encoded individually in each. But that input metadata appears to have issues.
I hadn't realized that, but yes, one of the liver segs here has incorrect geometry. We should get the data fixed and not use this for a demonstration or tutorial (unless we want to warn about bad metadata).
Yes, we are trying to use the metadata of the CT and SEG as encoded individually in each. But that input metadata appears to have issues.
@thewtex I don't see what are the issues with the data - the image is oriented differently, but I am not aware of any issues with the metadata. It works fine in Slicer, which is using dcmqi, screenshot below. You can see this series visualized in OHIF, which is using completely independent DICOM readers: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.2.840.113654.2.55.133032926508606330165457723341736723804&SeriesInstanceUIDs=1.2.840.113654.2.55.247854884634057477137769379611681725965,1.2.276.0.7230010.3.1.3.481034752.249602.1661572065.734219.
This is already the case. In fact, we even use an interpolator with better than nearest-neighbor interpolation for labels, available in a small, universal ITK-Wasm python package :-).
Where is this taking place? I did not see any resampling in the notebook code, unless I missed it.
I hadn't realized that, but yes, one of the liver segs here has incorrect geometry.
@pieper which sample are you referring to? I do not see any liver segmentations in the sample used in the notebook: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.2.840.113654.2.55.133032926508606330165457723341736723804&SeriesInstanceUIDs=1.2.840.113654.2.55.247854884634057477137769379611681725965,1.2.276.0.7230010.3.1.3.481034752.249602.1661572065.734219.
Thanks for the link to the actual data @fedorov. The data I was looking at came from a different study, but I don't have the uid handy. So there is a dataset with a bad liver segmentation geometry but it's not the one from this issue. Sorry for the extra confusion.
@thewtex - since Andrey and I confirmed the geometry is read correctly in OHIF and Slicer for StudyInstanceUID 1.2.840.113654.2.55.133032926508606330165457723341736723804 can you show and itk widgets example for it?
The seg and the referenced series uids are:
[0008,1115] ReferencedSeriesSequence SQ 12910
[fffe,e000] Item na 12894
[0020,000e] SeriesInstanceUID 1.2.840.113654.2.55.247854884634057477137769379611681725965 UI 60
[0020,000e] SeriesInstanceUID 1.2.276.0.7230010.3.1.3.481034752.249602.1661572065.734219 UI 58
@fedorov @pieper interesting!
Here is the metadata I get for the CT when loaded in 3D Slicer:
I was not able to load the QuantiativeReporting
extension for DCMQI in Slicer -- I get an extension not found with Slicer 5.6.2 and a backtrace when it tries to start with 5.7.0. Do you know the Slicer-loaded metadata for the SEG?
This seems to be different from what we get in this notebook,
Size: [512, 512, 110]
Spacing: [0.660156, 0.660156, 2.5]
Origin: [-177.3, -169, -268.69]
Direction:
1 0 0
0 1 0
0 0 1
I suspect there is a difference with how the CT is loaded and interpreted.
StudyInstanceUID 1.2.840.113654.2.55.133032926508606330165457723341736723804 can you show and itk widgets example for it?
Yes, that is what is in this PR.
Where is this taking place? I did not see any resampling in the notebook code, unless I missed it.
This is part of the rendering process.
~> This seems to be different from what we get in this notebook~
Well, we also have LPS / RAS to account for.
Do you know the Slicer-loaded metadata for the SEG?
This may be enlightening.
I was not able to load the QuantiativeReporting extension for DCMQI in Slicer -- I get an extension not found with Slicer 5.6.2 and a backtrace when it tries to start with 5.7.0.
Can you tell me what OS you are on? I just uninstalled/installed, and everything works for me on Mac with 5.6.2.
I checked cdash, and I do see an error for stable mac build for today, which seems to be an intermittent failure to clone the repo: https://slicer.cdash.org/builds/3504732/configure.
I also tested on 5.7.0, and did not see any problems.
Do you know the Slicer-loaded metadata for the SEG?
This part is tricky with my knowledge of Slicer.... The issue is that segmentations are not loaded into Slicer scalar volumes, but into "Slicer Segmentations". The only way I know of to see volume-level metadata (spacing, orientation, etc) is to export "Segmentation" into a volume, but that operation takes an image volume as a reference, so the result will not tell us what you want to know... Maybe @pieper or @lassoan can help here? How can we know the metadata for loaded segmentation similar to what Volumes module shows as in https://github.com/InsightSoftwareConsortium/itkwidgets/pull/757#issuecomment-2318134305?
Here's what I get if I export the segmentation as a labelmap (appears to match the CT exactly).
My suspicion is that resampling is not done correctly in itkwidgets.
Here's what I get if I export the segmentation as a labelmap (appears to match the CT exactly).
Doesn't that export process take reference volume as reference, and resamples to that reference? It would always match the geometry of the reference then, but it does not reflect geometry "as loaded".
A couple things
the extension server seems to be a bit flakey today. I tried 5.6.2 earlier on linux and got 0 extensions, then restarted Slicer and I was able to install as usual.
I'm pretty sure that the binary labelmap representation of the segmention will remain the 'primary' since that's what the SEG would be input as and won't be altered by export unless you specifically ask for it. If you loaded, e.g. an STL, it wouldn't be rasterized until you asked for it, typically in the Segment Editor in the context of a volume to use as the reference grid.
In this case the ImageOrientationPatient vectors are different in the SEG vs the CT so probably something in the reading process is converting the SEG to Axial IS.
In this case the ImageOrientationPatient vectors are different in the SEG vs the CT so probably something in the reading process is converting the SEG to Axial IS.
No, I don't think it is in the reading process. As I tried to explain earlier, I am pretty certain this is happening in the process of generating the original representation of the segmentation by nnU-Net that is then converted into SEG.
This is the approximate provenance of the SEG we are looking at:
[DICOM CT] --> nnU-Net --> [NIfTI] --> dcmqi --> [DICOM SEG] --> dcmqi --> [NRRD] --> Slicer
There is no way around it - we should just accept that orientations may differ, and make sure resampling is implemented correctly for the systems that don't render in patient space.
Actually, the pipeline is:
[DICOM CT] --> [NIfTI] --> nnU-Net --> [NIfTI] --> dcmqi --> [DICOM SEG] --> dcmqi --> [NRRD] --> Slicer
And it's possible that the first conversion from dicom to nii introduced a reorganization of the data.
By the "something in the reading process is converting the SEG to Axial IS." what I mean is that the CT and the SEG have different ImageOrientationPatient vectors in the instances, but they end up having the same IJK to RAS directions in Slicer, so they are being reorganized as part of that process. I believe it's because the data gets reshuffled in order to make the IJKToRAS transform right handed.
Here's what I get if I export the segmentation as a labelmap (appears to match the CT exactly).
Very helpful, thanks, Steve!
When we read the CT in this notebook, it looks consistent with Slicer. But the SEG is not consistent. As IJK->RAS is consistent between the CT and SEG, we should see a consistent IJK->LPS.
My suspicion is that resampling is not done correctly in itkwidgets.
The bounding box of the volumes, which is independent of any resampling, is showing that there are issues independent of resampling.
Can you tell me what OS you are on?
Linux :penguin:
[DICOM SEG] --> dcmqi --> [NRRD] --> Slicer
I wonder if we can capture the NRRD here? We could compare the header to a NRRD generated by itkwasm_image_io
from the read SEG.
[DICOM SEG] --> dcmqi --> [NRRD] --> Slicer
I wonder if we can capture the NRRD here? We could compare the header to a NRRD generated by itkwasm_image_io from the read SEG.
You can do that step running segimage2itkimage --inputDICOM [DICOM SEG] --outputDirectory . --mergeSegments
(this is exactly what is happening behind the scenes in QuantiativeReporting: https://github.com/QIICR/QuantitativeReporting/blob/master/DICOMPlugins/DICOMSegmentationPlugin.py#L104-L119). This is what I see for the SEG in question in the NRRD header:
sizes: 512 512 109
space directions: (0.66015599999999997,0,0) (0,-0.66015599999999997,0) (0,0,-2.5)
kinds: domain domain domain
endian: little
encoding: gzip
space origin: (-177.300003,168.33972199999999,3.8099975599999998)
Compared to what I see in Slicer when I load the resulting NRRD as a volume.
'm pretty sure that the binary labelmap representation of the segmention will remain the 'primary' since that's what the SEG would be input as and won't be altered by export unless you specifically ask for it.
@pieper from the above, comparing volume metadata between what you exported from Segmentations and what I loaded directly from SEG NRRD, clearly there is a resampling to reference that is happening (509 slices vs 510). Just wanted to bring your attention to this.
is a resampling to reference that is happening
That is interesting - yes, if I load the SEG alone and then convert to a binary labelmap it has 109 slices Axial SI so there is a resampling going on when there's a reference volume.
itkwidget's conversion of images to NGFF is dropping the direction array. https://github.com/InsightSoftwareConsortium/itkwidgets/blob/main/itkwidgets/integrations/__init__.py#L81 https://github.com/thewtex/ngff-zarr/blob/main/ngff_zarr/itk_image_to_ngff_image.py
I see the correct seg direction matrix in itkwidget's JS layer and correctly rendered images when I short circuit _get_viewer_image
to return the itkwasm.Image
without conversion to ngff. Right now, all images passed through ngff-zarr (and thus current itkwidgets) get an identity direction matrix when rendered.
Right now, all images passed through ngff-zarr (and thus current itkwidgets) get an identity direction matrix when rendered.
Scary stuff! I didn't realize that in order to render the image it first has to be converted into ngff-zarr.
Glad you traced it down. Hope it is solvable?
Following our Get Your Brain Together HCK03 and further work by @bogovicj, we now have an OME-Zarr RFC on Coordinate systems and transformations that includes a rotation
specification we can implement and apply in ngff-zarr to carry this information.
This is great @thewtex 👍 Do we have an example of a lossless nrrd <--> ome-zarr conversion process? I would love to see (and/or help with) a file plugin for Slicer.
example of a lossless nrrd <--> ome-zarr conversion process
I will work on these.
Check out this pull request on
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB