MIC-DKFZ / LIDC-IDRI-processing

Scripts for the preprocessing of LIDC-IDRI data
MIT License

How long does it take to convert the whole LIDC dataset to nrrd? #4

Closed ivanwilliammd closed 5 years ago

ivanwilliammd commented 5 years ago

Hello Sir Michael, it has been almost 48 hours since I started converting the LIDC-IDRI dataset (1010 patients) into nrrd (nifty) files, which will then be preprocessed into numpy by preprocessing.py from the medicaldetectiontoolkit repo. However, the conversion is still only at 60% progress. For my reference when preprocessing my private dataset, do you have any solution or code I could add to speed up the conversion using CUDA? (Note: I am planning to reconstruct my database using Sir Michael's code.) Is it possible to speed up the conversion using CUDA cores, given that you are using the third-party MITK software to convert the data to nrrd (nifty) files?

Thank you Sir

PC Used:
Intel i7-7700, 16GB RAM, NVIDIA GTX1050Ti 4GB

MiGoetz commented 5 years ago

I cannot tell you exactly, but the code wasn't written with performance in mind. I think it took me about 1 day?

I don't think using CUDA cores would significantly speed up the process. There is no single compute-intensive part of the program; rather, the main reason is the simple programming, which was written with a focus other than speed.

For example, the XML files contain no direct link to the corresponding DICOM files; instead, those are searched for using glob, which makes it quite slow. Another reason is the sheer number of patients: there are more than 1000, so if processing each patient takes more than 2 minutes, it already adds up to more than a day.
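One way the repeated glob search might be avoided is to walk the data directory once and keep a lookup table, so each later lookup is a dictionary access instead of a fresh filesystem scan. The sketch below is only an illustration and not the repository's actual code: it assumes that each DICOM series lives in a directory named after its series UID, and the names `build_series_index` and `find_series_dir` are hypothetical.

```python
import os


def build_series_index(root):
    """Walk `root` once and map every directory name to its full path.

    Assumption (not confirmed by the repo): each DICOM series sits in a
    directory whose name is the series UID referenced by the XML file.
    """
    index = {}
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            # setdefault keeps the first path seen if a name occurs twice
            index.setdefault(name, os.path.join(dirpath, name))
    return index


def find_series_dir(index, series_uid):
    """Dictionary lookup; returns None if the UID was never indexed."""
    return index.get(series_uid)
```

With this layout the directory tree is traversed once for the whole dataset, rather than once per XML file. If the UID is only stored inside the DICOM headers, the index would have to be built from the headers instead, which this sketch does not cover.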

However, I am happy to include any suggestions for speed-up.
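As one possible speed-up suggestion: since the runtime is dominated by the number of patients rather than by a single heavy computation, patients could be processed concurrently on the CPU. The sketch below is an assumption-laden illustration, not part of the repository. `convert_all` is a hypothetical helper, and `convert_patient` stands for whatever function converts a single patient. A thread pool is only expected to help if most of the per-patient time is spent in external tools or file I/O; for pure Python computation a `ProcessPoolExecutor` would be the usual choice.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def convert_all(patient_ids, convert_patient, max_workers=4):
    """Run `convert_patient(pid)` for every patient using a thread pool.

    `convert_patient` is a hypothetical per-patient conversion function
    supplied by the caller. Returns (results, failures), both keyed by
    patient ID, so one failing patient does not abort the whole run.
    """
    results = {}
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which patient
        futures = {pool.submit(convert_patient, pid): pid for pid in patient_ids}
        for future in as_completed(futures):
            pid = futures[future]
            try:
                results[pid] = future.result()
            except Exception as exc:
                failures[pid] = exc
    return results, failures
```

The number of workers would need to be chosen with memory in mind, since several CT volumes are then held in RAM at the same time.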

ivanwilliammd commented 5 years ago

Thank you Sir Michael for your reference. I finished processing all the LIDC data in around 3.5 days on my PC (i7-7700, 16GB RAM).