Qitingshe opened 4 months ago
I'm still working on the inference pipeline, but at the moment the model achieves only a poor CPM of about 36. I suspect this is due to the low IoU on small nodules and the simplified end-to-end modelling. I'm exploring options to enhance its performance.
Maybe try a multi-scale solution like https://pubmed.ncbi.nlm.nih.gov/36544067/
I also found someone using a Swin Transformer to detect nodules: https://arxiv.org/pdf/2310.03365.pdf
Hello, does most research use FROC? Which should we use for comparison, FROC or ROC AUC?
I was able to retrain the whole thing from scratch, just tweaking the hyperparameters slightly. I also used a fixed seed for the final evaluation, as in the snippet below:
import random

import numpy as np
import torch

def set_seed(seed):
    # Fix all relevant RNG seeds so the evaluation run is reproducible
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    random.seed(seed)

set_seed(41)
The results I got seem pretty great: F1 of ~94% and ROC AUC of ~98%. But the IoU was only around ~31%. For classification, it seems really good at distinguishing nodule vs non-nodule patches, but please correct me if I'm wrong.
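For reference, that IoU number can be reasoned about with a concrete metric. Here is a minimal sketch of axis-aligned 3D box IoU; the corner-based box format `(z1, y1, x1, z2, y2, x2)` is an assumption for illustration, not necessarily the repo's format:

```python
import numpy as np

def iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (z1, y1, x1, z2, y2, x2).

    Box format is an assumed convention for this sketch.
    """
    a = np.asarray(box_a, dtype=float)
    b = np.asarray(box_b, dtype=float)
    lo = np.maximum(a[:3], b[:3])          # lower corner of the intersection
    hi = np.minimum(a[3:], b[3:])          # upper corner of the intersection
    inter = np.prod(np.clip(hi - lo, 0, None))  # zero if boxes don't overlap
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2×2×2 cubes shifted by half their width along one axis overlap in a 1×2×2 region, giving an IoU of 4 / 12 ≈ 0.33, roughly the regime reported above; small localization offsets on small nodules drive the IoU down quickly.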
Hello, FROC better reflects performance in practice, as it evaluates both classification and localization accuracy at the scan level. A candidate is counted as a true positive only if its coordinates lie within one radius of a ground-truth nodule. The FROC score can be obtained using the inference pipeline in eval.py and the official evaluation scripts in evaluationScript.
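The within-one-radius hit criterion described above can be sketched as follows. This is a minimal illustration, not the official evaluation script; `candidates` is assumed to be sorted by descending confidence, and each ground-truth nodule may be hit at most once:

```python
import numpy as np

def match_candidates(candidates, gt_centers, gt_radii):
    """Label each candidate TP/FP under the one-radius hit criterion.

    candidates : list of (z, y, x) centers, sorted by descending confidence
    gt_centers : list of (z, y, x) ground-truth nodule centers
    gt_radii   : list of ground-truth nodule radii (same order as gt_centers)
    """
    gt_centers = np.asarray(gt_centers, dtype=float)
    hits = np.zeros(len(gt_centers), dtype=bool)  # each GT matched at most once
    labels = []
    for c in candidates:
        d = np.linalg.norm(gt_centers - np.asarray(c, dtype=float), axis=1)
        idx = int(np.argmin(d))
        if d[idx] <= gt_radii[idx] and not hits[idx]:
            hits[idx] = True
            labels.append(True)   # true positive: within one radius of an unmatched GT
        else:
            labels.append(False)  # false positive
    return labels, hits
```

Sweeping a confidence threshold over these labels and averaging sensitivity at fixed false-positive rates per scan is what produces the FROC/CPM numbers.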
Despite the high classification performance on scan patches, the vanilla ViT struggles to accurately localize nodules, especially small ones, resulting in a low IoU and consequently a low FROC score. As mentioned above, a multi-scale solution could be beneficial in addressing this issue. I'm reconsidering my approach.
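One cheap form of the multi-scale idea is test-time multi-scale inference: run the same detector on the volume resampled at several zoom factors and map detections back to original coordinates. The sketch below is only an illustration; `detect_fn` is a hypothetical single-scale detector interface, and the nearest-neighbour resampling keeps it dependency-free where a real pipeline would use proper interpolation:

```python
import numpy as np

def multi_scale_detect(volume, detect_fn, scales=(0.5, 1.0, 2.0)):
    """Run a single-scale detector at several resamplings and pool detections.

    detect_fn(vol) -> list of (z, y, x, score) is a hypothetical interface.
    Detections from all scales are mapped back to original-volume coordinates;
    they would still need merging (e.g. distance-based NMS) afterwards.
    """
    detections = []
    for s in scales:
        shape = tuple(max(1, int(round(n * s))) for n in volume.shape)
        # Nearest-neighbour resampling via index arithmetic (illustrative only)
        idx = np.meshgrid(
            *[np.minimum((np.arange(n) / s).astype(int), orig - 1)
              for n, orig in zip(shape, volume.shape)],
            indexing="ij",
        )
        scaled = volume[tuple(idx)]
        for z, y, x, score in detect_fn(scaled):
            detections.append((z / s, y / s, x / s, score))
    return detections
```

Upscaling (s > 1) effectively magnifies small nodules before the detector sees them, which is one reason multi-scale schemes help at the small end of the size distribution.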
Thank you for reproducing the model and pointing out the issues.
Can you give the FROC on LUNA16?