cutright / DVH-Analytics

A DICOM Database Application for Radiation Oncology

Cancer Imaging Archive (TCIA) DICOM import failed #131

Closed mateuszbaran closed 3 years ago

mateuszbaran commented 3 years ago

Hi!

I tried importing a few DICOM datasets from different sources, and all imports failed in different ways. For some sets the program hangs, for others it closes before the import is complete, and sometimes the import window just disappears and nothing is added to the database. One of the datasets I tried is the HNSCC dataset available on this site: https://www.cancerimagingarchive.net/nbia-search/ . I'm using version 0.9.1 on Windows. Here is my log file:

2020-12-31 19:47:23,810 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
time data '' does not match format '%Y%m%d'
2020-12-31 19:47:23,811 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
strptime() argument 1 must be str, not None
2020-12-31 19:47:23,812 - dvha - ERROR - Unhandled exception: Traceback (most recent call last):
  File "DVH-Analytics\dvha\main.py", line 1221, in Run
  File "threading.py", line 865, in run
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1490, in import_target
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1193, in __init__
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1221, in run
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 255, in get_plan_row
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 593, in total_mu
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 1435, in fx_count
ValueError: invalid literal for int() with base 10: '24.000'

2021-01-02 14:35:48,492 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
time data '' does not match format '%Y%m%d'
2021-01-02 14:35:48,507 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
strptime() argument 1 must be str, not None
2021-01-02 14:35:48,508 - dvha - ERROR - Unhandled exception: Traceback (most recent call last):
  File "DVH-Analytics\dvha\main.py", line 1221, in Run
  File "threading.py", line 865, in run
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1490, in import_target
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1193, in __init__
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1221, in run
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 255, in get_plan_row
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 593, in total_mu
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 1435, in fx_count
ValueError: invalid literal for int() with base 10: '24.000'

2021-01-02 14:55:41,378 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
time data '' does not match format '%Y%m%d'
2021-01-02 14:55:41,378 - dvha - WARNING - DICOM_Parser: get_time_stamp failed
strptime() argument 1 must be str, not None
2021-01-02 14:55:41,379 - dvha - ERROR - Unhandled exception: Traceback (most recent call last):
  File "DVH-Analytics\dvha\main.py", line 1221, in Run
  File "threading.py", line 865, in run
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1490, in import_target
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1193, in __init__
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1221, in run
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 255, in get_plan_row
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 593, in total_mu
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 1435, in fx_count
ValueError: invalid literal for int() with base 10: '10.000'

2021-01-02 18:43:25,056 - dvha - WARNING - StudyImporter.run: Skipping PTV related calculations. No PTV found for mrn: HN-HGJ-092
2021-01-02 18:46:05,940 - dvha - WARNING - StudyImporter.run: Skipping PTV related calculations. No PTV found for mrn: HNSCC-01-0214
2021-01-02 18:48:12,268 - dvha - ERROR - Unhandled exception: Traceback (most recent call last):
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 453, in get_dvh_row
  File "DVH-Analytics\venv\Lib\site-packages\dicompylercore\dvhcalc.py", line 83, in get_dvh
  File "DVH-Analytics\venv\Lib\site-packages\dicompylercore\dicomparser.py", line 550, in GetStructureCoordinates
  File "DVH-Analytics\venv\Lib\site-packages\pydicom\dataset.py", line 783, in __getattr__
AttributeError: 'Dataset' object has no attribute 'ContourGeometricType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "DVH-Analytics\dvha\main.py", line 1221, in Run
  File "threading.py", line 865, in run
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1490, in import_target
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1193, in __init__
  File "DVH-Analytics\dvha\models\import_dicom.py", line 1255, in run
  File "DVH-Analytics\dvha\db\dicom_parser.py", line 457, in get_dvh_row
  File "DVH-Analytics\venv\Lib\site-packages\dicompylercore\dvhcalc.py", line 83, in get_dvh
  File "DVH-Analytics\venv\Lib\site-packages\dicompylercore\dicomparser.py", line 550, in GetStructureCoordinates
  File "DVH-Analytics\venv\Lib\site-packages\pydicom\dataset.py", line 783, in __getattr__
AttributeError: 'Dataset' object has no attribute 'ContourGeometricType'

2021-01-05 11:27:11,415 - dvha - WARNING - StudyImporter.run: Skipping PTV related calculations. No PTV found for mrn: HNSCC-01-0099
2021-01-05 11:32:16,748 - dvha - WARNING - StudyImporter.run: Skipping PTV related calculations. No PTV found for mrn: HN-HMR-019
2021-01-05 11:36:19,979 - dvha - WARNING - StudyImporter.run: Skipping PTV related calculations. No PTV found for mrn: HN-HGJ-059
2021-01-05 11:39:06,787 - dvha - WARNING - D:\Dose3D\Dicom data\COOK\HEAD_AND_NECK\DYNAMIC_1\Dose\RD.1.2.246.352.71.7.787609392710.2543894.20200401103530.dcm: PatientID from DICOM-RT Dose file could not be matched to any DICOM-RT Plan
2021-01-05 11:39:06,787 - dvha - WARNING - D:\Dose3D\Dicom data\COOK\HEAD_AND_NECK\DYNAMIC_1\Dose\RD.1.2.246.352.71.7.787609392710.2544015.20200401103530.dcm: PatientID from DICOM-RT Dose file could not be matched to any DICOM-RT Plan
2021-01-05 11:39:06,787 - dvha - WARNING - D:\Dose3D\Dicom data\COOK\HEAD_AND_NECK\DYNAMIC_1\Dose\RD.1.2.246.352.71.7.787609392710.2547336.20200409143406.dcm: PatientID from DICOM-RT Dose file could not be matched to any DICOM-RT Plan
2021-01-05 11:39:06,787 - dvha - WARNING - D:\Dose3D\Dicom data\COOK\HEAD_AND_NECK\DYNAMIC_1\Dose\RD.1.2.246.352.71.7.787609392710.2547337.20200409143406.dcm: PatientID from DICOM-RT Dose file could not be matched to any DICOM-RT Plan
2021-01-05 11:39:06,787 - dvha - WARNING - D:\Dose3D\Dicom data\COOK\HEAD_AND_NECK\DYNAMIC_1\Dose\suma.dcm: PatientID from DICOM-RT Dose file could not be matched to any DICOM-RT Plan

Could you help me with importing this data?

cutright commented 3 years ago

Hi Mateusz,

Supporting anonymized DICOM files is difficult due to a very wide variation in anonymization scripts. When you say you've tried from different sources, were any of them actual patient DICOM files from commercial TPS's? If so, I'm very interested to see how they failed.

Concerning the errors above, I see three issues with these DICOM files:
1) NumberOfFractionsPlanned (300A,0078) is stored as a decimal string (e.g., '24.000')
2) ContourGeometricType (3006,0042) is missing
3) The Dose files are so "anonymized" that they can't be matched by DICOM tags

Number 1: That's an easy one-line fix that is now available in 0.9.2.
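As a minimal sketch of the kind of fix this implies (not necessarily the change that went into DVHA), a value such as '24.000' can be converted by parsing it as a float first and only then narrowing to int. The helper name `parse_fraction_count` is hypothetical and does not come from the DVHA code base.

```python
def parse_fraction_count(value):
    """Convert a NumberOfFractionsPlanned value to int.

    Accepts ints, floats, and strings such as '24' or '24.000'.
    Returns None when the value is missing or is not a whole number.
    """
    if value is None:
        return None
    try:
        number = float(str(value).strip())
    except ValueError:
        # Empty or non-numeric strings end up here.
        return None
    # Reject values like '24.5' rather than silently truncating them.
    if not number.is_integer():
        return None
    return int(number)
```

Returning None for non-integral values is a design choice in this sketch; a caller could equally log a warning or raise.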

Number 2: @bastula Any thoughts on this one?
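One possible way to tolerate the missing tag, offered only as a sketch and not as what dicompyler-core does, is to read the attribute with a default instead of accessing it directly. The example uses `SimpleNamespace` stand-ins so it runs without any DICOM files; the function names are hypothetical, and skipping contours that lack the tag is an assumed policy.

```python
from types import SimpleNamespace


def contour_geometric_type(contour, default=None):
    """Return the contour's ContourGeometricType, or `default` if absent.

    `contour` is any object exposing DICOM keywords as attributes.
    Using getattr with a default avoids the AttributeError in the log.
    """
    return getattr(contour, "ContourGeometricType", default)


def usable_contours(contours):
    """Keep only contours that explicitly declare CLOSED_PLANAR geometry."""
    return [c for c in contours if contour_geometric_type(c) == "CLOSED_PLANAR"]


# Stand-in objects for illustration; real code would pass parsed datasets.
contours = [
    SimpleNamespace(ContourGeometricType="CLOSED_PLANAR", NumberOfContourPoints=4),
    SimpleNamespace(NumberOfContourPoints=4),  # tag (3006,0042) missing
]
kept = usable_contours(contours)
```

Whether a contour without the tag should be skipped or treated as CLOSED_PLANAR is a judgment call that depends on how much the anonymized data can be trusted.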

Number 3: This can probably be fixed with the "Pre-Process DICOM" button in the bottom right. Beforehand, make sure each plan set (Plan, Structure, Dose) is in its own directory with no other files. That button will ensure every file in each directory has the same StudyInstanceUID, and that each directory in that run gets a StudyInstanceUID different from all the others.
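The UID assignment described here can be sketched roughly as follows. This is an illustration of the idea only, not DVHA's implementation: the function names are hypothetical, the example paths are made up, and the sketch only plans which UID each file should receive without reading or writing any DICOM data. UIDs are generated in the UUID-derived `2.25.` form.

```python
import uuid


def new_uid():
    """Generate a DICOM UID of the form 2.25.<UUID as a decimal integer>."""
    return "2.25." + str(uuid.uuid4().int)


def plan_study_uids(files_by_directory):
    """Map every file path to a StudyInstanceUID shared within its directory.

    `files_by_directory` is a dict of {directory: [file paths]}. Each
    directory gets one freshly generated UID, so UIDs agree within a
    directory and differ between directories.
    """
    assignments = {}
    for directory, paths in files_by_directory.items():
        uid = new_uid()
        for path in paths:
            assignments[path] = uid
    return assignments


# Hypothetical layout: one directory per plan set (Plan, Structure, Dose).
example = {
    "patient_a": ["patient_a/plan.dcm", "patient_a/struct.dcm", "patient_a/dose.dcm"],
    "patient_b": ["patient_b/plan.dcm", "patient_b/struct.dcm", "patient_b/dose.dcm"],
}
uids = plan_study_uids(example)
```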

Long story short, I'll give the files from the link you posted a shot and see if any other issues pop up. I was already fairly close to releasing 0.9.2, but maybe these will be quick fixes. But I'm unsure about number 2.

Dan

cutright commented 3 years ago

@mateuszbaran Can you be more specific about which files failed above? I just imported Plan, Structure, and Dose for HNSCC-01-0001 with no issues.

mateuszbaran commented 3 years ago

Thank you for a quick response.

> Supporting anonymized DICOM files is difficult due to a very wide variation in anonymization scripts. When you say you've tried from different sources, were any of them actual patient DICOM files from commercial TPS's? If so, I'm very interested to see how they failed.

No, none of them were actual patient files. I don't have access to non-anonymized DICOM files.

> Number 3 This can probably be fixed with the "Pre-Process DICOM" button in the bottom right. Before hand, make sure each plan set (Plan, Structure, Dose) is in a directory with no other files. That button will ensure every file in each directory has the same StudyInstanceUID, and a unique StudyInstanceUID from all the directories in that run.

> @mateuszbaran Can you be more specific about which files failed above? I just imported Plan, Structure, and Dose for HNSCC-01-0001 with no issues.

I tried again with a freshly downloaded HNSCC-01-0001 (previously I was working on a slightly preprocessed variant). On the first try apparently nothing was imported, so I decided to try again and this time nothing was shown in the "studies" tree. I tried removing local data (from the C:\Users\Mateusz\Apps folder) in case something was cached from my earlier attempts and now the program doesn't even start. What do I need to delete to get DVH Analytics back to a clean state?

cutright commented 3 years ago

I'm unsure what is happening. I just tried to reproduce this issue by deleting the contents of ~/Apps/dvh_analytics and opening DVHA, and again by deleting the directory ~/Apps/dvh_analytics itself with no issues. In both cases, DVHA created the directories and appropriate files. If you're running the MSW executable, this is the only location of local files (other than what PyInstaller's magic does, but that's unrelated).

By default, files will be moved to ~/Apps/dvh_analytics/data/imported unless you check "Leave files in inbox" in the top left of the import screen. That's probably why nothing showed up in your studies tree.
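A quick way to check whether files were moved there is to list that directory. The sketch below assumes the default location named above; the function name `list_imported_files` is hypothetical, and the sub-layout of the imported folder is not assumed, so it simply walks it recursively.

```python
from pathlib import Path


def list_imported_files(app_dir=None):
    """Return all files under <app_dir>/data/imported, sorted by path.

    Defaults to ~/Apps/dvh_analytics. Returns an empty list if the
    directory does not exist.
    """
    if app_dir is None:
        app_dir = Path.home() / "Apps" / "dvh_analytics"
    imported = Path(app_dir) / "data" / "imported"
    if not imported.is_dir():
        return []
    return sorted(p for p in imported.rglob("*") if p.is_file())
```

Calling `list_imported_files()` with no argument inspects the default location; passing another directory is useful if the application data lives elsewhere.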

Are you sure nothing was imported? How did you verify? I'd be surprised if nothing was imported, because it seems the files were moved.

mateuszbaran commented 3 years ago

I tried again and now the program runs, and I even managed to properly import HNSCC-01-0002. I still can't, however, import HNSCC-01-0001. I've attached screenshots of import windows from successful import of HNSCC-01-0002, unsuccessful attempt to import HNSCC-01-0001 and a query I performed later. I didn't touch the HNSCC-01-0001 files with any other programs and there are no logs that could indicate why no patients were found for HNSCC-01-0001.

I guess debugging DVH-Analytics is the only way to see why no patients were found?

[Attached screenshots: hnscc-photon-2, hnscc-new, hnscc-old]

cutright commented 3 years ago

From your screenshot, you've queried for 'PHOTON' plans. But HNSCC-01-0001 is showing up as a 'PHOTON' and an 'ELECTRON' plan. You should be able to add another filter with ELECTRON to get both.

You can poke around your data by clicking on the Database icon in the toolbar.

[Attached screenshot: Screen Shot 2021-01-05 at 3 09 44 PM]

cutright commented 3 years ago

By the way, @mateuszbaran, you appear to have a few GitHub repos, so if you happen to be familiar with Python and SQL, I'd encourage you to check out the backend documentation at dvha.readthedocs.io.

Fair to close this issue? Happy to investigate a data set if you let me know which one is failing for you (other than the fraction number, which will be fixed in v0.9.2 very shortly).

mateuszbaran commented 3 years ago

Thank you for your help; for now I don't need further assistance. I'm evaluating different options for viewing photon radiotherapy DICOM files. I'll definitely put some time into learning the backend of DVH Analytics if I decide to use it more.

cutright commented 3 years ago

> I tried removing local data (from the C:\Users\Mateusz\Apps folder) in case something was cached from my earlier attempts and now the program doesn't even start. What do I need to delete to get DVH Analytics back to a clean state?

I think I've reproduced this error, but it didn't happen for me until I upgraded to wxPython 4.1.1. However, there is a requirement lock with wxPython>=4.0.4,<4.1.0 (except on the v0.9.3 branch which I just created yesterday). So maybe this is a different issue? I resolved it by downgrading wxPython to <4.1.0.
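To check whether an installed wxPython falls inside that pin, a plain version comparison is enough. The sketch below is illustrative only; both function names are hypothetical, and it handles purely numeric version strings such as '4.1.1'.

```python
def parse_version(text):
    """Turn '4.1.1' into (4, 1, 1); non-numeric suffixes are not handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_wx_pin(version, minimum="4.0.4", upper_exclusive="4.1.0"):
    """Check a version against the pin quoted above: wxPython>=4.0.4,<4.1.0."""
    v = parse_version(version)
    return parse_version(minimum) <= v < parse_version(upper_exclusive)
```

In an environment with wxPython installed, the installed version string could be passed in, for example `satisfies_wx_pin(wx.__version__)`.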

cutright commented 3 years ago

Traced the segmentation fault to this line: https://github.com/cutright/DVH-Analytics/blob/3292cd520f86f1d02657ed1840f94760cd5678c0/dvha/tools/errors.py#L85

Resolved in v0.9.3. Apparently, wxPython 4.1.1 doesn't like this call and wants you to use a style flag in MessageDialog for centering instead: https://discuss.wxpython.org/t/wx-messagedialog-center-causes-segmentation-fault-on-4-1-1/

mateuszbaran commented 3 years ago

Yes, I think I most likely experienced a different issue. Anyway, thanks for the help :+1: