ymli39 / DeepSEED-3D-ConvNets-for-Pulmonary-Nodule-Detection

DeepSEED: 3D Squeeze-and-Excitation Encoder-Decoder ConvNets for Pulmonary Nodule Detection
MIT License

Trained model predicts high number of bboxes. #27

Closed MjdMahasneh closed 3 years ago

MjdMahasneh commented 3 years ago

Hello @shakjm, I hope you're well.

Since @ymli39 does not seem to be responding to new issues (I hope he is well and that I hear back from him soon), and since I have noticed that you have been actively interested in this repo, I am hoping you may be able to help with some questions I have.

I am running the training script with Python 3.5 on the LUNA-16 dataset and trying to reproduce the results from the paper. For some reason, the model produces a very high number of bounding-box predictions per scan (~1 million, sometimes more). Have you come across this issue before? I am not sure why it is happening. It also slows evaluation drastically, because non-maximum suppression becomes extremely slow.

Could this be due to the difference in Python versions? I have checked the code for inconsistencies between Python 2 and 3; in particular, I checked every division in all scripts to make sure they behave consistently.

Another thing I have noticed is the use of thresh=-3 in train_detector_se.py, which I don't quite understand. It seems to be used for producing a mask.

I apologize, but this is a desperate call for help. I have been working on this for a good while now, so I would really appreciate any advice or assistance :)

Many thanks in advance.

ymli39 commented 3 years ago

Sorry about the late response; I have not been checking this repo for a while because I moved on to another project. As for generating too many predictions, there are thresholds you can tune to generate fewer: they are in "LIDC_detector/FROCeval.py", named nmsthresh and detp. You can make those bigger to save computational cost; the lower you set them, the more candidates will be used in the calculation. That said, I believe generating those candidates also takes a long time on my server (though not as many as 1 million).
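To illustrate why a higher detection threshold cuts the work left for non-maximum suppression, here is a minimal sketch. It assumes each candidate is a row of `[score, z, y, x, diameter]` with `score` being a raw logit; that layout, the function `filter_candidates`, and the random data are hypothetical and are not taken from FROCeval.py.

```python
import numpy as np

def filter_candidates(candidates, score_thresh):
    """Keep only rows whose score (column 0) is >= score_thresh."""
    candidates = np.asarray(candidates, dtype=float)
    return candidates[candidates[:, 0] >= score_thresh]

# Synthetic candidates: random logit scores plus random box parameters.
rng = np.random.default_rng(0)
scores = rng.normal(loc=-4.0, scale=2.0, size=100_000)
boxes = rng.uniform(0.0, 128.0, size=(100_000, 4))
cands = np.column_stack([scores, boxes])

# A higher threshold leaves fewer candidates for the later NMS step.
for thresh in (-3.0, -1.0, 1.0):
    kept = filter_candidates(cands, thresh)
    print(f"thresh={thresh:+.1f}: {len(kept)} candidates kept")
```

Since a naive greedy NMS compares boxes pairwise, dropping low-score candidates before it runs is usually the cheapest way to shorten evaluation.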

Those large numbers of candidates may be confusing, but when I went step by step through the FROC code released on the LUNA website, I found that sensitivity is calculated by counting only the true positives; false positives have no impact on sensitivity. That is why methods, not just mine but also the ones I compared against in my paper, tend to generate huge numbers of candidates to detect as many true positives as possible.
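The point about sensitivity can be shown with a toy calculation. This is a simplified sketch, not the LUNA evaluation script: `count_at_threshold`, the scores, and the nodule labels are all made up for illustration.

```python
def count_at_threshold(scores, is_nodule, thresh):
    """Count true and false positives among candidates scoring >= thresh."""
    tp = sum(1 for s, n in zip(scores, is_nodule) if s >= thresh and n)
    fp = sum(1 for s, n in zip(scores, is_nodule) if s >= thresh and not n)
    return tp, fp

total_nodules = 3  # hypothetical ground truth: one nodule is never detected

scores = [2.1, 0.3, -1.0, -2.5]
is_nodule = [True, False, True, False]
tp, fp = count_at_threshold(scores, is_nodule, thresh=-3.0)
sens_before = tp / total_nodules

# Add 1000 extra low-score false candidates.
scores_many = scores + [-2.9] * 1000
is_nodule_many = is_nodule + [False] * 1000
tp_many, fp_many = count_at_threshold(scores_many, is_nodule_many, thresh=-3.0)
sens_after = tp_many / total_nodules

print(tp, fp, sens_before)
print(tp_many, fp_many, sens_after)
```

The false-positive count grows from 2 to 1002, while the sensitivity stays at 2/3 because it depends only on the true positives and the number of nodules.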

thresh = -3 is used because we output the logits directly (that is, the values before the sigmoid function), which is why it is a negative number.
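For reference, a logit threshold can be converted to a probability with the standard sigmoid function. The short sketch below uses only the Python standard library; the helper names `sigmoid` and `logit` are illustrative and are not taken from the repo.

```python
import math

def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Inverse of sigmoid: map a probability in (0, 1) back to a logit."""
    return math.log(p / (1.0 - p))

print(round(sigmoid(-3.0), 4))  # 0.0474
print(round(logit(0.5), 4))     # 0.0
```

So a threshold of -3 on the logits keeps any candidate with a probability above roughly 0.047, which is a permissive cutoff.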

MjdMahasneh commented 3 years ago

Thank you for your detailed response, it's very much appreciated.