anthonyweidai / SvANet

Official implementation of "Exploiting Scale-Variant Attention for Segmenting Small Medical Objects"

Couple of questions about the paper. #9

Closed · aymuos15 closed this 1 month ago

aymuos15 commented 1 month ago
  1. For nnUNet, did you use the resnet version?
  2. Why use Cross-Entropy as the loss? -- I suppose this adversely affects most of the networks.
  3. Are there any ablations based on the loss?
  4. Why are there no overlap metrics reported?
  5. I also think evaluating on the BraTS-Mets dataset would be a nice test.

Also, minor typo: 'of training except that nnUNet utilized official settings [13] for training.' Should be: 'of training except that of nnUNet utilized official settings [13] for training.'

Thanks!

anthonyweidai commented 1 month ago

Thank you for your interest in our work. I will address your questions one by one:

  1. No, as far as I know, nnUNet adapted the UNet design and made several modifications to data augmentation, architecture, loss function, and other training and evaluation techniques, resulting in a tightly coupled codebase. I ran their code on my datasets directly.
  2. Other loss functions could be used. Cross-entropy loss is the most common one. Since our contribution is not the loss function, we were not strict about it.
  3. Using a combination of several existing loss functions can improve performance but may reduce efficiency. For example, nnUNet used a combined Dice and Cross-entropy loss (see the sketch after this list).
  4. Area-based metrics such as mIoU and mDice were calculated in our work.
  5. It would be interesting to validate our method on BraTS. Due to page length and time limitations, we have not tested it yet. However, we have tested our methods on several other datasets not reported in the manuscript, and the results support the paper's conclusions.
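For concreteness, here is a minimal PyTorch sketch of the Dice + Cross-Entropy combination mentioned in point 3. The weighting and smoothing terms are illustrative assumptions, not the exact values used by nnUNet or in our paper:

```python
# Hedged sketch of a combined Dice + Cross-Entropy loss.
# `ce_weight`, `dice_weight`, and `smooth` are assumed defaults for illustration.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, smooth=1e-5, ce_weight=1.0, dice_weight=1.0):
    """logits: (B, C, H, W) raw scores; target: (B, H, W) integer class labels."""
    ce = F.cross_entropy(logits, target)

    probs = F.softmax(logits, dim=1)
    target_1h = F.one_hot(target, num_classes=logits.shape[1])  # (B, H, W, C)
    target_1h = target_1h.permute(0, 3, 1, 2).float()           # (B, C, H, W)

    dims = (0, 2, 3)  # reduce over batch and spatial dims, keep per-class terms
    intersection = (probs * target_1h).sum(dims)
    cardinality = probs.sum(dims) + target_1h.sum(dims)
    dice = (2.0 * intersection + smooth) / (cardinality + smooth)
    dice_loss = 1.0 - dice.mean()

    return ce_weight * ce + dice_weight * dice_loss
```

Using the soft (probability-based) Dice term keeps the loss differentiable, so it can simply be swapped in wherever the plain cross-entropy loss is computed during training.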

If I misunderstood the "overlap metrics" or other points, could you please provide more information?

aymuos15 commented 1 month ago
  1. I was talking about this -- ResNet Presets

2/3. I would argue that Dice + CE is the most common one. I agree about not being strict with the loss, but segmentation tasks generally do well with Dice + CE. It may well improve your model's scores too, but to me, Dice + CE would be a fairer evaluation. To follow up on your point: efficiency isn't a stated objective in your paper either, right?

  4. Apologies, I meant boundary metrics (like HD95).

  5. Sounds good!

Really appreciate the time taken to answer the questions!

anthonyweidai commented 1 month ago

Thanks for your reply. We used nnU-Net (org.) from their repo. For the loss function, it is worth trying Dice + CE during training. We plan to implement more metrics in future work. We noticed that several online implementations of metrics (e.g., HD95) don't handle inputs with a batch dimension well, leading to slow validation if it is performed every epoch; a per-sample loop, as in the sketch below, works but is slow.
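For reference, a minimal NumPy/SciPy sketch of per-sample HD95 over a batch. The function names and the binary-mask input format are assumptions for illustration, not an existing library API:

```python
# Hedged sketch: HD95 computed one sample at a time along the batch dimension.
# Assumes binary masks shaped (B, H, W) (or (B, D, H, W)); names are illustrative.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def hd95_single(pred, gt):
    """95th-percentile symmetric surface distance for one binary mask pair."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    if not pred.any() or not gt.any():
        return np.nan  # undefined when either mask is empty
    # extract one-voxel-thick borders via erosion
    pred_border = pred ^ binary_erosion(pred)
    gt_border = gt ^ binary_erosion(gt)
    # distance from every voxel to the nearest border voxel of the other mask
    dt_gt = distance_transform_edt(~gt_border)
    dt_pred = distance_transform_edt(~pred_border)
    dists = np.concatenate([dt_gt[pred_border], dt_pred[gt_border]])
    return np.percentile(dists, 95)

def hd95_batch(preds, gts):
    """Apply hd95_single along the batch dimension."""
    return np.array([hd95_single(p, g) for p, g in zip(preds, gts)])
```

The distance-transform approach avoids pairwise point comparisons, but the Python-level loop over the batch is exactly why running it every epoch is slow; computing it only at final evaluation is the usual workaround.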

aymuos15 commented 1 month ago

I think you should use the ResNet presets; they give better scores. See this discussion of a different paper on a few of the challenge datasets -- https://openreview.net/forum?id=qmN9v3O69J

And yes, that's fair. I was only talking about test metrics.

Thank you for answering all the questions, and great work! Really interesting read :)