Closed: nengwp closed this issue 2 years ago
Dear @nengwp ,
nnDetection does not include a false positive reduction (FPR) stage: it was out of scope of the initial publication, and I'm not sure how much performance could be gained from one, for two reasons: 1) I'm not sure an FPR network can learn features that the detection network cannot already learn during its own training (the FPR stage sees no new information). 2) Designing task-specific FPR stages (e.g. using projections of the proposals) does not fit the scope of nnDetection, since we expect its components to work across many datasets.
Of course, you could train your own FPR stage on the predictions generated by nnDetection and use it during inference; a rough sketch of such a stage is given below.
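For illustration only, here is a minimal sketch of such a stage, assuming patches are cropped around nnDetection's predicted boxes and labelled by IoU matching against the ground truth. The architecture, all names, and the score-fusion rule are assumptions for this sketch, not part of nnDetection:

```python
# Hypothetical false-positive-reduction (FPR) stage: a small 3D CNN that
# re-scores patches cropped around boxes predicted by a detector.
# Architecture, patch handling and fusion rule are illustrative only.
import torch
import torch.nn as nn


class FPRNet(nn.Module):
    """Binary classifier: lesion (true positive) vs. background (false positive)."""

    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.InstanceNorm3d(16), nn.ReLU(inplace=True), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.InstanceNorm3d(32), nn.ReLU(inplace=True), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.InstanceNorm3d(64), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(64, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, D, H, W) patches centred on predicted boxes
        feats = self.features(x).flatten(1)
        return self.classifier(feats).squeeze(1)  # raw logits


def rescore(detector_scores: torch.Tensor, fpr_logits: torch.Tensor) -> torch.Tensor:
    """Combine the detector confidence with the FPR probability.

    Multiplying the two scores is one simple fusion rule; averaging them or
    replacing the detector score entirely are equally valid choices.
    """
    return detector_scores * torch.sigmoid(fpr_logits)
```

Training labels would come from matching the validation predictions against the ground-truth boxes (e.g. IoU >= 0.1 counts as a true positive), and the resulting probability can then be fused with the detector confidence as in rescore.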
Best, Michael
Thank you very much for your quick reply
I made this request because I found a large number of false positive predictions in the inference analysis report.
Many false positives appear at iou_0.1_score_0.1, and many remain at iou_0.5_score_0.5; the latter setting also increases the number of false negatives.
Is it possible for me to improve the results without additional FPR components?
Hi @nengwp ,
1) The analysis reports are only intended for visual inspection and are not designed to give a comprehensive impression of the performance. The model is not calibrated, and the cutoffs for the confidence score (the "probability" output of the network) are chosen arbitrarily. Increasing the score threshold will always add more false negatives while reducing false positives; this is the same trade-off as for a normal classifier under an ROC evaluation. To get a better intuition of the performance, I would suggest looking at the FROC curve and checking which working point is acceptable for your application (this usually depends on the underlying diagnostic goal, e.g. screening requires high sensitivity). Based on the validation set, choose the confidence score threshold that matches your clinical need; a sketch of this selection is given below. In my experience, the FROC evaluation also gives a somewhat pessimistic estimate: when looking at the FP predictions, most of them turn out to be corner cases which are still interesting to look at.
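To make the threshold selection concrete, here is a minimal sketch assuming you have per-prediction confidence scores and TP/FP flags from IoU matching on the validation set; the input format is an assumption, and nnDetection's own evaluation output would need to be mapped onto it:

```python
# Hypothetical working-point selection from validation predictions.
# Assumed inputs: per-prediction confidence scores, a boolean TP flag
# (from IoU matching against ground truth), the total number of
# ground-truth lesions, and the number of validation scans.
import numpy as np


def froc_curve(scores, is_tp, n_lesions, n_scans):
    """Return (fp_per_scan, sensitivity, threshold) arrays, one entry per cutoff.

    Assumes at most one prediction is matched to each ground-truth lesion.
    """
    order = np.argsort(scores)[::-1]          # sort predictions by confidence, descending
    is_tp = np.asarray(is_tp, dtype=bool)[order]
    thresholds = np.asarray(scores, dtype=float)[order]
    tp_cum = np.cumsum(is_tp)                 # true positives kept at each cutoff
    fp_cum = np.cumsum(~is_tp)                # false positives kept at each cutoff
    return fp_cum / n_scans, tp_cum / n_lesions, thresholds


def pick_threshold(scores, is_tp, n_lesions, n_scans, target_sensitivity=0.9):
    """Lowest-FP threshold that still reaches the target sensitivity."""
    fp_per_scan, sensitivity, thresholds = froc_curve(scores, is_tp, n_lesions, n_scans)
    ok = sensitivity >= target_sensitivity
    if not ok.any():
        raise ValueError("Target sensitivity is never reached on this validation set.")
    idx = np.argmax(ok)                       # first cutoff meeting the target
    return thresholds[idx], fp_per_scan[idx], sensitivity[idx]


# Toy usage: 6 predictions, 4 ground-truth lesions over 2 scans.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
is_tp = [True, False, True, True, False, True]
thr, fps, sens = pick_threshold(scores, is_tp, n_lesions=4, n_scans=2)
print(f"threshold={thr:.2f}, FP/scan={fps:.1f}, sensitivity={sens:.2f}")
```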
2) The current set of parameters worked well across the datasets we looked at in our paper, but certain components are still difficult to capture in rules, and dataset-specific finetuning can usually improve the results to some degree (e.g. some datasets are more heterogeneous than others and thus benefit from stronger augmentation; see the sketch below). A task-specific FPR implementation might be helpful but requires additional work and tuning.
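As one concrete example of such finetuning, the snippet below builds a stronger augmentation pipeline with batchgenerators, the augmentation library the nnU-Net/nnDetection ecosystem builds on. The specific transform parameters are assumptions to illustrate the idea, not nnDetection's defaults:

```python
# Illustrative "stronger augmentation" pipeline with batchgenerators.
# All parameter values are assumptions, not nnDetection defaults.
from batchgenerators.transforms.abstract_transforms import Compose
from batchgenerators.transforms.spatial_transforms import SpatialTransform, MirrorTransform
from batchgenerators.transforms.noise_transforms import GaussianNoiseTransform
from batchgenerators.transforms.color_transforms import (
    BrightnessMultiplicativeTransform,
    ContrastAugmentationTransform,
)

patch_size = (96, 96, 96)  # assumed training patch size

strong_augmentation = Compose([
    # Wider rotation/scaling ranges than a conservative default
    SpatialTransform(
        patch_size,
        do_elastic_deform=False,
        do_rotation=True,
        angle_x=(-0.5, 0.5), angle_y=(-0.5, 0.5), angle_z=(-0.5, 0.5),  # radians
        do_scale=True, scale=(0.7, 1.4),
        random_crop=False,
    ),
    MirrorTransform(axes=(0, 1, 2)),
    GaussianNoiseTransform(noise_variance=(0, 0.1), p_per_sample=0.2),
    BrightnessMultiplicativeTransform(multiplier_range=(0.7, 1.3), p_per_sample=0.2),
    ContrastAugmentationTransform(p_per_sample=0.2),
])
```

These transforms operate on the dict-style batches produced by batchgenerators data loaders; in an nnDetection training run they would replace or extend the pipeline generated during planning, which is not shown here.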
Best, Michael
Hi @mibaumgartner
Thank you very much for your answer; I will close this question.
> Based on the validation set, choose the confidence score threshold that matches your clinical need. In my experience, the FROC evaluation also gives a somewhat pessimistic estimate: when looking at the FP predictions, most of them turn out to be corner cases which are still interesting to look at.
Hello, thank you for these clear hints. Can you help me find these thresholds so that I can choose one for deployment, please? Thank you in advance. Best, Thibault
:question: Question
False positive reduction is very important for object detection, for example when applied to the LUNA16 dataset.
Can I add this component to nnDetection? If so, what should I do?