Closed vkt1414 closed 2 months ago
@vkt1414 based on the discussion in the issue above, can you run a check for the consistency of the first component of ImageType, and see how many of the series would be filtered out? Since we did not have any such failures for NLST, it means this issue is rare and did not affect the prior cohort analyzed.
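The check requested above could be sketched roughly as follows. This is a minimal illustration, assuming the per-instance ImageType values have already been collected as `(series_uid, image_type)` pairs (for example from a metadata table); the function names and the sample records are hypothetical and not taken from the thread.

```python
from collections import defaultdict


def first_component(image_type):
    """Return the first backslash-separated component of an ImageType value."""
    return image_type.split("\\")[0].strip().upper()


def find_inconsistent_series(records):
    """Map each series with more than one distinct first component to those values.

    `records` is an iterable of (series_uid, image_type) pairs, one per instance.
    """
    seen = defaultdict(set)
    for series_uid, image_type in records:
        seen[series_uid].add(first_component(image_type))
    return {uid: sorted(vals) for uid, vals in seen.items() if len(vals) > 1}


# Hypothetical sample data: series-A mixes ORIGINAL and DERIVED instances,
# while series-B differs only in the second component.
records = [
    ("series-A", "ORIGINAL\\PRIMARY\\AXIAL"),
    ("series-A", "DERIVED\\SECONDARY\\AXIAL"),
    ("series-B", "ORIGINAL\\PRIMARY\\AXIAL"),
    ("series-B", "ORIGINAL\\SECONDARY\\AXIAL"),
]
print(find_inconsistent_series(records))  # {'series-A': ['DERIVED', 'ORIGINAL']}
```

Note that a check on the first component alone would not flag a series such as series-B, where only the second component differs.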
30 of 35 had inconsistent image type values in a series. Of the remaining 5, 4 have gantry tilt errors. There's only 1 mysterious series left with series uid 1.2.840.113654.2.55.181615646167494492707070609724375539907
I was curious to see if there are any series with inconsistent image type in the 126k cohort. In total there are 63 series. So there are 33 other series that were somehow able to convert properly, which makes me wonder whether some other issue was blocking the conversion of the 30 series. I summarized the 63 series in a csv file as we are dealing with a lot of numbers. 63_inconsistent_imagetype_series.csv
@vkt1414 can you share with me a series where there is a gantry tilt error as well as the remaining problematic image? Ideally, you could send a link to the zipped DICOMs from Google Drive to my institutional email or provide a URL I can download with curl.
@vkt1414 two images in the series 1.2.840.113654.2.55.181615646167494492707070609724375539907 are corrupted. You can see this by running the following code in Python:
from idc_index import index
c = index.IDCClient()
c.download_from_selection(seriesInstanceUID="1.2.840.113654.2.55.181615646167494492707070609724375539907", downloadDir="./")
Note that 126 of the 128 files include image data and have sizes around 527 KB. However, two files are unusually small (2 KB): 2b6326b8-a790-43af-b764-26b6883e0516.dcm, ded44b5b-2281-4d44-969a-b1830b116c72.dcm. Viewing the DICOM header (e.g. using dcmdump) shows these files are missing the image data.
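The size comparison described above can be automated with a short script. The following is only a sketch using the standard library: the function name, the `*.dcm` pattern, and the 10% threshold are assumptions chosen for illustration, and a small file is merely a hint that pixel data may be missing, not proof.

```python
from pathlib import Path
from statistics import median


def find_undersized_files(directory, pattern="*.dcm", ratio=0.1):
    """Return files whose size is below `ratio` times the median file size.

    Searches `directory` recursively, since a download tool may create
    nested folders. The threshold is an arbitrary illustrative choice.
    """
    files = sorted(Path(directory).rglob(pattern))
    if not files:
        return []
    sizes = {p: p.stat().st_size for p in files}
    typical = median(sizes.values())
    return [p for p in files if sizes[p] < ratio * typical]


if __name__ == "__main__":
    # Assumes the series was downloaded into the current directory.
    for path in find_undersized_files("./"):
        print(path.name, path.stat().st_size, "bytes")
```

Any file reported this way should still be inspected with a DICOM tool such as dcmdump to confirm whether the image data is actually absent.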
Hi @neurolabusc ,
Thank you very much for diagnosing the mysterious series. We will look into how the series made it to IDC.
Please find the links to GitHub release attachments that should be downloadable with curl or wget, containing: 4 series that showed gantry tilt errors https://github.com/vkt1414/CloudSegmentator/releases/download/test/GantryTiltError.zip
30 series that were NOT able to convert with inconsistent ImageType https://github.com/vkt1414/CloudSegmentator/releases/download/test/Inconsistent.ImageType.zip
33 series that were able to convert despite inconsistent ImageType https://github.com/vkt1414/CloudSegmentator/releases/download/test/dcmniix_processed_despite_inconsistent_imageType_part1.zip https://github.com/vkt1414/CloudSegmentator/releases/download/test/dcmniix_processed_despite_inconsistent_imageType_part2.zip
I included all series for completeness but please feel free to pick and choose selectively. I'm curious how some series were able to, and some weren't able to convert to NIfTI with inconsistent ImageType.
Thank you very much for your help. We really appreciate it!
@neurolabusc Just wondering if you had a chance to review how the inconsistent ImageType series were able to be converted by dcm2niix. Thank you!
@vkt1414 if you want feedback, provide a minimal demo. A 1 GB download with 24 series acquired on three separate sessions does not make it explicit what your issue is. You can always rename and reorganize your DICOMs with dcm2niix -r y /path/to/DICOMs to extract out specific series that are causing you confusion.
@vkt1414, as it turns out, TCIA requires that you get approval in order to access their NLST data... which is kind of ironic considering that it's available from IDC. I have access, so I downloaded a zip of the series in question, which you can get at gs://whc_etl_dev/1.2.840.113654.2.55.181615646167494492707070609724375539907.zip There are, indeed, two instances in the zip which are 2300 B.
@bcli4d thank you very much! I was going to message you on slack about them.
@neurolabusc I apologize if the previously attached files were overwhelming. I have now included only two series, one for each kind of error: a gantry tilt error, and an inconsistent ImageType that dcm2niix nonetheless converted to NIfTI without any trouble.
https://github.com/vkt1414/CloudSegmentator/releases/download/test/dcm2niix_troubleshooting.zip
I'm more interested in the latter as the claim that 'dcm2niix can't convert series with inconsistent ImageType' does not seem to hold true all the time.
Thank you very much!
@fedorov I double checked with pydicom and confirmed that there is no pixel data in those two DICOM files, even from TCIA. Please advise what steps we should take next.
I think there are several action items:
@vkt1414
- one series has some files saved with ORIGINAL\PRIMARY\AXIAL\CT_SOM5 SPI and others saved with ORIGINAL\SECONDARY\AXIAL\CT_SOM5 SPI. It would be wise not to combine these. However, dcm2niix only avoids combining derived and non-derived images from the same series, and neither is labeled as DERIVED, so dcm2niix concatenates these.
- I do not get a gantry tilt warning with the current stable release of dcm2niix. I suspect that the recent commit to increase the gantry tilt tolerance explains this.
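To make the first point concrete, here is a simplified model of the rule as described in this comment. It is an illustrative assumption, not dcm2niix's actual source code, and the function names are hypothetical: a series is treated as needing a split only when some instances carry DERIVED in ImageType and others do not.

```python
def is_derived(image_type):
    """Return True if any component of the ImageType value is DERIVED."""
    return "DERIVED" in (part.strip().upper() for part in image_type.split("\\"))


def mixes_derived_and_non_derived(image_types):
    """Return True if a series contains both derived and non-derived instances."""
    flags = {is_derived(value) for value in image_types}
    return len(flags) > 1


# The two ImageType values reported for the series discussed above.
series = [
    "ORIGINAL\\PRIMARY\\AXIAL\\CT_SOM5 SPI",
    "ORIGINAL\\SECONDARY\\AXIAL\\CT_SOM5 SPI",
]
print(mixes_derived_and_non_derived(series))  # False
```

Under this model the PRIMARY/SECONDARY series is not split, which is consistent with the concatenation described above, whereas a series mixing ORIGINAL and DERIVED instances would be.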
@neurolabusc Thank you very much for your help! We really appreciate it.
35 out of 126088 series failed at the dcm2niix step. I tried running these on a colab notebook and here's the breakdown of warning messages from dcm2niix. But I cannot seem to analyze why these failed.
29 series had these warnings:
and 4 series had gantry tilt warnings:
and 2 series had PatientOrientation warnings which are seen very commonly
Here's the list of the series:
Only these 5 series triggered the tolerance-exceeded warning in Slicer.
The notebook and list of series along with slicer idcbrowser urls are attached. https://colab.research.google.com/drive/1oMbG_xImkcE5bsc5MIN_2B2yU6bZe2Sg?usp=sharing slicer_links.csv