Qitingshe opened 4 months ago
I'm still working on the inference pipeline, but at the moment the model achieves only a poor CPM of about 36. I suspect this is due to the low IoU on small nodules and the simplified end-to-end modelling. I'm exploring options to enhance its performance.
Maybe try a multi-scale solution like https://pubmed.ncbi.nlm.nih.gov/36544067/
I also found someone using a Swin Transformer to detect nodules: https://arxiv.org/pdf/2310.03365.pdf
Hello, does most research use FROC? Which should we use for comparison, FROC or ROC AUC?
I was able to retrain the whole thing from scratch, just tweaking the hyperparameters slightly. I also used a fixed seed for the final evaluation, as in the snippet below:
import random

import numpy as np
import torch

def set_seed(seed):
    # Fix all relevant RNG seeds so the evaluation run is reproducible
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    random.seed(seed)

set_seed(41)
The results I got seem pretty great: F1 of ~94% and ROC AUC of ~98%. But the IoU was only around ~31%. For classification, it seems really good at distinguishing nodule vs non-nodule patches, but please correct me if I'm wrong.
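For reference, that IoU number can be reasoned about with a concrete metric. Here is a minimal sketch of axis-aligned 3D box IoU; the corner-based box format `(z1, y1, x1, z2, y2, x2)` is an assumption for illustration, not necessarily the repo's format:

```python
import numpy as np

def iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (z1, y1, x1, z2, y2, x2).

    Box format is an assumed convention for this sketch.
    """
    a = np.asarray(box_a, dtype=float)
    b = np.asarray(box_b, dtype=float)
    lo = np.maximum(a[:3], b[:3])          # lower corner of the intersection
    hi = np.minimum(a[3:], b[3:])          # upper corner of the intersection
    inter = np.prod(np.clip(hi - lo, 0, None))  # zero if boxes don't overlap
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2×2×2 cubes shifted by half their width along one axis overlap in a 1×2×2 region, giving an IoU of 4 / 12 ≈ 0.33, roughly the regime reported above; small localization offsets on small nodules drive the IoU down quickly.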
Hello, FROC better reflects performance in practice, as it evaluates both classification and localization accuracy at the scan level. A candidate is counted as a true positive only if its coordinates lie within one radius of a ground-truth nodule. The FROC score can be obtained using the inference pipeline in eval.py and the official evaluation scripts in evaluationScript.
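The within-one-radius hit criterion described above can be sketched as follows. This is a minimal illustration, not the official evaluation script; `candidates` is assumed to be sorted by descending confidence, and each ground-truth nodule may be hit at most once:

```python
import numpy as np

def match_candidates(candidates, gt_centers, gt_radii):
    """Label each candidate TP/FP under the one-radius hit criterion.

    candidates : list of (z, y, x) centers, sorted by descending confidence
    gt_centers : list of (z, y, x) ground-truth nodule centers
    gt_radii   : list of ground-truth nodule radii (same order as gt_centers)
    """
    gt_centers = np.asarray(gt_centers, dtype=float)
    hits = np.zeros(len(gt_centers), dtype=bool)  # each GT matched at most once
    labels = []
    for c in candidates:
        d = np.linalg.norm(gt_centers - np.asarray(c, dtype=float), axis=1)
        idx = int(np.argmin(d))
        if d[idx] <= gt_radii[idx] and not hits[idx]:
            hits[idx] = True
            labels.append(True)   # true positive: within one radius of an unmatched GT
        else:
            labels.append(False)  # false positive
    return labels, hits
```

Sweeping a confidence threshold over these labels and averaging sensitivity at fixed false-positive rates per scan is what produces the FROC/CPM numbers.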
Despite the high classification performance on scan patches, the vanilla ViT struggles to accurately localize nodules, especially small ones, resulting in a low IoU and consequently a low FROC score. As mentioned above, a multi-scale solution could be beneficial in addressing this issue. I'm reconsidering my approach.
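One cheap form of the multi-scale idea is test-time multi-scale inference: run the same detector on the volume resampled at several zoom factors and map detections back to original coordinates. The sketch below is only an illustration; `detect_fn` is a hypothetical single-scale detector interface, and the nearest-neighbour resampling keeps it dependency-free where a real pipeline would use proper interpolation:

```python
import numpy as np

def multi_scale_detect(volume, detect_fn, scales=(0.5, 1.0, 2.0)):
    """Run a single-scale detector at several resamplings and pool detections.

    detect_fn(vol) -> list of (z, y, x, score) is a hypothetical interface.
    Detections from all scales are mapped back to original-volume coordinates;
    they would still need merging (e.g. distance-based NMS) afterwards.
    """
    detections = []
    for s in scales:
        shape = tuple(max(1, int(round(n * s))) for n in volume.shape)
        # Nearest-neighbour resampling via index arithmetic (illustrative only)
        idx = np.meshgrid(
            *[np.minimum((np.arange(n) / s).astype(int), orig - 1)
              for n, orig in zip(shape, volume.shape)],
            indexing="ij",
        )
        scaled = volume[tuple(idx)]
        for z, y, x, score in detect_fn(scaled):
            detections.append((z / s, y / s, x / s, score))
    return detections
```

Upscaling (s > 1) effectively magnifies small nodules before the detector sees them, which is one reason multi-scale schemes help at the small end of the size distribution.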
Thank you for reproducing the model and pointing out the issues.
Can you give the FROC on LUNA16?