mercure-imaging / MAP-monaiclassify


Unable to see bounding boxes on outputs #2

Closed: junxiant closed this issue 4 months ago

junxiant commented 4 months ago

Hello, I ran the script as follows:

 python lung_app -i ./inputs -o ./outputs -m ./models/model/lung_model.ts

and I got results with 4 nodules. The logs showed no errors and printed the following text:

Number of nodules: 4
Confident Boxes:  [[-128.42869567871094, -175.2610321044922, -299.06256103515625, 6.0030517578125, 6.03265380859375, 6.080352783203125], [103.94212341308594, -211.94577026367188, -227.51504516601562, 4.128715515136719, 4.1184539794921875, 3.9498443603515625], [81.47509002685547, -109.63416290283203, -123.52464294433594, 4.14727783203125, 4.1520843505859375, 3.964019775390625], [105.86669921875, -267.88201904296875, -163.27650451660156, 7.089256286621094, 7.131011962890625, 7.061492919921875]]
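
For anyone reading the log above, here is a minimal sketch for turning one of these boxes into corner coordinates. It assumes each box is laid out as `[cx, cy, cz, w, h, d]`, i.e. a center followed by a size, in world coordinates (mm). That layout is an assumption based on the shape of the numbers (three positions, then three small positive extents); the log does not state it. The helper name is hypothetical.

```python
def box_center_size_to_corners(box):
    """Convert an assumed [cx, cy, cz, w, h, d] box to (min_corner, max_corner)."""
    cx, cy, cz, w, h, d = box
    center = (cx, cy, cz)
    half = (w / 2.0, h / 2.0, d / 2.0)
    lo = tuple(c - s for c, s in zip(center, half))
    hi = tuple(c + s for c, s in zip(center, half))
    return lo, hi


# First box from the log above.
box = [-128.42869567871094, -175.2610321044922, -299.06256103515625,
       6.0030517578125, 6.03265380859375, 6.080352783203125]
lo, hi = box_center_size_to_corners(box)
print("min corner (mm):", lo)
print("max corner (mm):", hi)
```

If the corners land far outside the scan's physical extent, the box layout or the coordinate convention is probably not what was assumed here.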

But when I tried to load the output data into 3D Slicer to view the CT scan, the bounding boxes were not shown. Can you help me out? The sample data I used was from LUNA16 (the resampled version from the MONAI Model Zoo link), which I converted to DICOM format.

I have included the input file here (1.3.6.1.4.1.14519.5.2.1.6279.6001.100225287222365663678666836860) : https://drive.google.com/file/d/11htVlWMg7OG1oyWi8sUhUl-pF96tK8bX/view?usp=sharing

Output file after running the model: https://drive.google.com/file/d/1pGCivZl3SorHrEF4V4QqOf6c5IWupXvn/view?usp=sharing

Aman-saimbhi commented 4 months ago

Hello,

I looked at the data you shared and am unsure about the nodule count. Do you have access to the ground truth value by any chance? I checked the DICOM tags, and the patient ID does not match any of the ones in the official nodules count list.

Nodules count link: https://www.cancerimagingarchive.net/wp-content/uploads/LIDC-XML-only.zip

Also, since you are already using Slicer, it would be great if you could check whether the nodules can be visualized from the predicted coordinates by following this tutorial:

https://github.com/Project-MONAI/tutorials/tree/main/detection/luna16_visualization

junxiant commented 4 months ago

Sorry, I mixed up the data. I tested another scan and it found 1 nodule, but the bounding box is off. Here is how it looks in 3D Slicer: [image]

The file name of the original LUNA16 scan (not resampled) is 1.3.6.1.4.1.14519.5.2.1.6279.6001.105756658031515062000744821260.raw, from subset0: https://drive.google.com/file/d/1SY4YyLy1cxJ32ncWy5C2JoxBU8RzkbVj/view?usp=sharing

I loaded this into 3D Slicer, exported it as DICOM, and used it as input: https://drive.google.com/file/d/1RUrReoov1RydfQA6lsXYIoLyMzgQRl42/view?usp=sharing

Then the outputs I got: https://drive.google.com/file/d/1lcoLAI0gWlMOJ4ah6OlMuMC0pci4FXKF/view?usp=sharing

I will also share the log output as a text file: log.txt

I think that when the scan is exported as DICOM from 3D Slicer, the Series Instance UID changes. The output also seems to have a different Series Instance UID from the input?

In candidates_V2.csv I was able to find rows with the file name of the original .raw file: [image]

However, it does not exist in annotations.csv. I will try the nodule visualization with manually entered coordinates later.

junxiant commented 4 months ago

Let me provide another scan from LUNA16; this one exists in annotations.csv: [image]

It found 4 nodules, but I think one of the bounding boxes is wrongly predicted: [image]

The original scan (RAW + MHD), 1.3.6.1.4.1.14519.5.2.1.6279.6001.100225287222365663678666836860.raw: https://drive.google.com/file/d/1vXz6JRdk682OKd3esXcE-C7kbIh8MWM3/view?usp=sharing

The input file after using 3D Slicer "Export DCM" function: https://drive.google.com/file/d/1y-Dg-l6O24rmpGUSHoI_f0sroIESSQk8/view?usp=sharing

The output: https://drive.google.com/file/d/1jeREt-eRTSAjwSw2DQ2_h4meDWyUn9XL/view?usp=sharing

And the log output is in this text file: logs_1.3.6.1.4.1.14519.5.2.1.6279.6001.100225287222365663678666836860.txt

I hope this is enough information.

Maybe the issue is the resampled data? I downloaded the resampled data from the MONAI Model Zoo link: https://monai.io/model-zoo.html

The 1.3.6.1.4.1.14519.5.2.1.6279.6001.100225287222365663678666836860.nii.gz file went through the same process: I converted it to DICOM with 3D Slicer, ran the inference, and imported the output, but no bounding box is shown.

Aman-saimbhi commented 4 months ago

I went through all the information you shared. According to the annotations.csv file, there are 2 nodules. The model is predicting 2 extra nodules with high scores in addition to the actual ones.
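
One way to make this comparison with annotations.csv concrete is to count a prediction as a hit when its center lies within the annotated nodule's radius (half of the annotated diameter in mm). The sketch below assumes that criterion and uses made-up placeholder coordinates, not values from the shared scan. The function names are hypothetical.

```python
import math


def match_predictions(pred_centers, annotations):
    """Map each annotation index to the prediction indices that hit it.

    pred_centers: list of (x, y, z) in mm.
    annotations:  list of (x, y, z, diameter_mm), all in the same
                  coordinate convention as the predictions.
    """
    hits = {}
    for i, (ax, ay, az, diameter) in enumerate(annotations):
        radius = diameter / 2.0
        hits[i] = [j for j, p in enumerate(pred_centers)
                   if math.dist(p, (ax, ay, az)) <= radius]
    return hits


def unmatched_predictions(pred_centers, hits):
    """Indices of predictions that hit no annotation (candidate false positives)."""
    matched = {j for js in hits.values() for j in js}
    return [j for j in range(len(pred_centers)) if j not in matched]


# Placeholder values for illustration only.
pred_centers = [(10.0, 20.0, 30.0), (50.0, 50.0, 50.0)]
annotations = [(11.0, 20.5, 29.0, 6.0)]

hits = match_predictions(pred_centers, annotations)
extra = unmatched_predictions(pred_centers, hits)
print("hits per annotation:", hits)
print("unmatched predictions:", extra)
```

Predictions and annotations must use the same coordinate convention for the distances to mean anything.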

We are only providing this MAP as a template based on the MONAI model and tutorial documentation. I will not be able to help with any lack of generalization of the model or any other error in model performance. We are not providing a trained model for clinical or research purposes, just showing how you can integrate an open-source MONAI model into Mercure.

That said, for the 2 correctly predicted nodules, the predictions are very close to the values in the annotations file, and the corresponding two bounding boxes also appear to be valid. If you want to verify the exact position of the 2 correct nodules, I would suggest using the coordinates predicted by the model to create a 3D mesh, as outlined in this tutorial:

https://github.com/Project-MONAI/tutorials/tree/main/detection/luna16_visualization
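
As a rough illustration of the mesh idea (this is not the tutorial's own script), a box can be written out as a Wavefront OBJ cuboid and loaded into Slicer as a model. The sketch assumes boxes are `[cx, cy, cz, w, h, d]` in mm. It also assumes the boxes are in LPS world coordinates while Slicer displays RAS, so x and y are negated by default. If the boxes are already in RAS, pass `flip_lps_to_ras=False`. The function name is hypothetical.

```python
def box_to_obj(box, flip_lps_to_ras=True):
    """Return Wavefront OBJ text for one axis-aligned box [cx, cy, cz, w, h, d]."""
    cx, cy, cz, w, h, d = box
    if flip_lps_to_ras:
        # Assumption: input is LPS; RAS negates the first two axes.
        cx, cy = -cx, -cy
    xs = (cx - w / 2.0, cx + w / 2.0)
    ys = (cy - h / 2.0, cy + h / 2.0)
    zs = (cz - d / 2.0, cz + d / 2.0)
    # 8 corners; 1-based OBJ index = 1 + 4*ix + 2*iy + iz.
    vertices = [(x, y, z) for x in xs for y in ys for z in zs]
    # 6 quad faces, one per side of the cuboid.
    faces = [
        (1, 2, 4, 3),  # x = min
        (5, 6, 8, 7),  # x = max
        (1, 2, 6, 5),  # y = min
        (3, 4, 8, 7),  # y = max
        (1, 3, 7, 5),  # z = min
        (2, 4, 8, 6),  # z = max
    ]
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    lines += ["f " + " ".join(str(i) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


# Placeholder box for illustration only.
obj_text = box_to_obj([1, 2, 3, 2, 2, 2])
with open("box_0.obj", "w") as fh:
    fh.write(obj_text)
```

If the resulting cuboid appears mirrored relative to the scan, toggling `flip_lps_to_ras` is the first thing to try.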

Adding bounding boxes for a 3D classification output with world coordinates is tricky, so there may well be some error in the code. It would be great if you could share the results of the exploration above in case they differ from what we are currently seeing.
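
To illustrate why world coordinates are easy to get wrong: mapping a point in mm to a voxel index depends on the image origin, spacing, and axis directions. Below is a minimal sketch, assuming the image axes are aligned with the world axes (identity direction cosines). The origin and spacing are placeholders; for a real scan they should come from the image header. `world_to_voxel` is a hypothetical helper, not a function from this repository.

```python
def world_to_voxel(point_mm, origin_mm, spacing_mm):
    """Map a world-space point (mm) to the nearest voxel index.

    Assumes identity direction cosines, so each axis is handled
    independently: index = (point - origin) / spacing.
    """
    return tuple(round((p - o) / s)
                 for p, o, s in zip(point_mm, origin_mm, spacing_mm))


# Placeholder geometry for illustration only.
origin = (-100.0, -100.0, -50.0)
spacing = (1.0, 1.0, 2.5)
index = world_to_voxel((-90.0, -80.0, -25.0), origin, spacing)
print("voxel index:", index)
```

A mismatch in origin, spacing, or axis sign between the input and the overlay shifts or mirrors the boxes, which matches the kind of offset reported above.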

As for the export, Slicer does seem to change many of the DICOM tags when exporting to a different format. You can download the image data directly (link below) if you want to avoid the export step. Also, the code already takes care of resampling by applying a pre-transform function to the image data by default, so I believe starting from the resampled data could be the issue.

https://www.cancerimagingarchive.net/collection/lidc-idri/

junxiant commented 4 months ago

"Also, the code is taking care of the resampling part by applying a pre-transform function to the image data by default. So, I believe that could be the issue with using the resampled data as the starting point."

Got it, I think that clears up a lot. It could be resampling data that was already resampled, causing this error.

Maybe increasing the "score" threshold would help reduce false positives. I'll look into the visualization later. Closing this issue for now, thanks for all the help.
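
A minimal sketch of that kind of filtering, assuming the model returns one confidence score per box. The boxes, scores, and threshold below are placeholders, and `filter_by_score` is a hypothetical helper rather than a function from this repository.

```python
def filter_by_score(boxes, scores, threshold):
    """Keep only (box, score) pairs whose score is at least `threshold`."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    return [(b, s) for b, s in zip(boxes, scores) if s >= threshold]


# Placeholder values for illustration only.
boxes = [
    [10.0, 20.0, 30.0, 6.0, 6.0, 6.0],
    [50.0, 50.0, 50.0, 4.0, 4.0, 4.0],
    [-5.0, 15.0, 25.0, 7.0, 7.0, 7.0],
]
scores = [0.95, 0.40, 0.80]

kept = filter_by_score(boxes, scores, threshold=0.5)
print("kept", len(kept), "of", len(boxes), "boxes")
```

Raising the threshold trades false positives for missed nodules, so it helps to check the result against annotated scans before settling on a value.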