LIDC-IDRI preprocessing steps

ymli39 / DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder ConvNets for Pulmonary Nodule Detection

MIT License


LIDC-IDRI preprocessing steps #50

Open sajidalirander opened 1 year ago

sajidalirander commented 1 year ago

Hi @ymli39,

Thanks for sharing your work.

I have reproduced the results on the LUNA16 dataset and am now trying to do the same with the LIDC-IDRI dataset.

I have encountered a problem and need some clarification.

I am using the prepareLIDC.py file. It takes three inputs: (1) the preprocess path, (2) data_path, and (3) the path to the new_nodule csv file.

new_nodule.csv contains patients' nodules only up to LIDC-127, along with multiple slices. When I run the script, xx, yy, and zz are zero for the other patient IDs, i.e., those beyond LIDC-127. Another question: how do we know which slice contains a nodule from the new_nodule.csv file alone?

I also explored new_non_nodule.csv, which has the same patient IDs, again only up to LIDC-127. It does not have slice information either.

Another question: what was the data structure? Mine is as follows:

LIDC-IDRI-Data
- LIDC-IDRI-0001
  - dicom files
- LIDC-IDRI-0002
  - dicom files (...)

Thank you for your time and effort.

ymli39 commented 1 year ago

Hi @sajidalirander, LIDC contains a subset of CT scans with slice thickness >= 5 mm as well as the CT scans from the LUNA dataset. As far as I remember, I excluded those LUNA scans from the LIDC task, which is probably why you only see LIDC up to ID 127. You can find the corresponding LIDC ID in sid2csv.csv. Unfortunately, I don't recall the data structure, since it has been several years.

sajidalirander commented 1 year ago

Thank you for the response. Could you provide details about extracting the mask without knowing the slice number?

ymli39 commented 11 months ago

Sorry for taking some time to get back to you. As far as I remember, LIDC provides XML files online that give the location (x, y, z, d) of each nodule per scan. I extracted the information from those files.
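For readers following along, here is a minimal sketch of how (x, y, z, d) could be pulled from an annotation XML of that kind. It is not the code used for this repository. The tag names (`unblindedReadNodule`, `roi`, `imageZposition`, `edgeMap`, `xCoord`, `yCoord`) and the `http://www.nih.gov` namespace are assumptions based on the public LIDC-IDRI annotation layout, `nodule_summaries` is a hypothetical helper, and the diameter here is only a rough in-plane extent in pixels. `SAMPLE` is a made-up fragment, not real data.

```python
import xml.etree.ElementTree as ET

# Assumed default namespace of the LIDC-IDRI annotation files.
NS = {"lidc": "http://www.nih.gov"}


def nodule_summaries(xml_text):
    """Return one (x, y, z, d) tuple per outlined nodule.

    x, y: mean contour coordinates in pixels
    z:    mean imageZposition of the nodule's slices (scanner units, mm)
    d:    larger in-plane extent of the contour in pixels (rough estimate)
    """
    root = ET.fromstring(xml_text)
    results = []
    for nodule in root.iterfind(".//lidc:unblindedReadNodule", NS):
        xs, ys, zs = [], [], []
        for roi in nodule.iterfind("lidc:roi", NS):
            # Each roi is one slice; its z position identifies the slice.
            zs.append(float(roi.findtext("lidc:imageZposition", namespaces=NS)))
            for edge in roi.iterfind("lidc:edgeMap", NS):
                xs.append(int(edge.findtext("lidc:xCoord", namespaces=NS)))
                ys.append(int(edge.findtext("lidc:yCoord", namespaces=NS)))
        if not xs:
            continue  # skip entries without contour points
        x_c = sum(xs) / len(xs)
        y_c = sum(ys) / len(ys)
        z_c = sum(zs) / len(zs)
        d = max(max(xs) - min(xs), max(ys) - min(ys)) + 1
        results.append((x_c, y_c, z_c, d))
    return results


# Made-up annotation fragment for illustration only.
SAMPLE = """<LidcReadMessage xmlns="http://www.nih.gov">
  <readingSession>
    <unblindedReadNodule>
      <noduleID>N1</noduleID>
      <roi>
        <imageZposition>-100.0</imageZposition>
        <edgeMap><xCoord>10</xCoord><yCoord>20</yCoord></edgeMap>
        <edgeMap><xCoord>14</xCoord><yCoord>24</yCoord></edgeMap>
      </roi>
      <roi>
        <imageZposition>-102.0</imageZposition>
        <edgeMap><xCoord>12</xCoord><yCoord>22</yCoord></edgeMap>
      </roi>
    </unblindedReadNodule>
  </readingSession>
</LidcReadMessage>"""

print(nodule_summaries(SAMPLE))  # [(12.0, 22.0, -101.0, 5)]
```

The z value is what ties a nodule to a slice: matching `imageZposition` against each DICOM slice's position gives the slice index, which is the information missing from the csv files alone.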

shweta2266 commented 7 months ago

Hi @sajidalirander,

Could you please share the details of how you reproduced the results on the LUNA dataset? I tried the same pipeline. Training with luna_train.npy went fine, and I got around 97% TPR and 99% TNR. However, when I tested on the test cases listed in luna_test.npy, I did not get any predicted bounding boxes with a threshold of -3. I also tried different thresholds, but I still do not get any bounding boxes.
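To make the threshold concrete, here is a small sketch under two assumptions that this thread does not confirm: that the detection score is a raw logit (pre-sigmoid), and that each candidate is a row of (score, z, y, x, diameter). `filter_boxes` and the candidate values are hypothetical.

```python
import math


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def filter_boxes(boxes, logit_threshold=-3.0):
    """Keep candidates whose score (first column) exceeds the threshold."""
    return [b for b in boxes if b[0] > logit_threshold]


# Hypothetical candidates: (score, z, y, x, diameter)
candidates = [
    (-5.2, 40.0, 120.0, 200.0, 6.0),
    (-1.0, 52.0, 180.0, 150.0, 8.5),
    (2.3, 77.0, 210.0, 310.0, 12.0),
]

kept = filter_boxes(candidates)
print(len(kept))                  # 2
print(round(sigmoid(-3.0), 3))    # 0.047
```

Under these assumptions a threshold of -3 already admits anything above roughly 5% probability, so an empty result at that threshold would point to a problem upstream of the threshold (inputs, case IDs, or configuration) more than to the threshold value itself.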

Hi @ymli39, can you please look into this and provide some help?

Thanks and regards,
Shweta

sajidalirander commented 7 months ago

Hi @shweta2266,

Is the evaluation script saving the bounding box files as empty?

Try the following debugging steps:

- Check the test.npy file and make sure the sample codes have 3 digits.
- Check that the right configuration file is loaded.
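One reading of the first step is that the case IDs listed in the test file need to be zero-padded to three digits so that they match the names of the preprocessed files. A minimal sketch of that normalization, assuming the IDs are plain integers or numeric strings (`normalize_case_ids` is a hypothetical helper, not part of the repository):

```python
def normalize_case_ids(case_ids, width=3):
    """Zero-pad each case ID to a fixed width, e.g. 7 -> '007'."""
    return [str(c).strip().zfill(width) for c in case_ids]


print(normalize_case_ids([0, "7", "42", "123"]))  # ['000', '007', '042', '123']
```

The padded list could then be saved back with `numpy.save` and compared against the file names in the preprocess folder.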

If nothing works, please share the script you are working with.

shweta2266 commented 6 months ago

Hi @sajidalirander,

Thank you so much for your response. I could reproduce the results.