Hi @shinpaul14, thanks for your feedback. We have incorporated the remaining code into the repository and included training instructions in the README. If you have further questions, feel free to let us know.
Access to the annotations CSV (or the preprocessing code to generate it) will be released shortly. We will provide further updates on this matter within this issue. Best,
UPDATE: The annotations CSV generation instructions have been added to the README.
Hello @Amiiney,
Thank you for your help. I have another question: with the currently provided code base, are SwinT and SwinT+SelfD the only reproducible results?
Hello @Amiiney
I have a question about the last linear layer of the model.
```python
self.model = timm.create_model(model_name, pretrained=pretrained)
# Get the number of features in the final embedding
n_features = self.model.head.in_features
# Update the classification layer with our custom target size
self.model.head = nn.Linear(n_features, CFG.target_size)
```
When I ran this code, I received an error because the model output shape is [64, 7, 7, 131].
Hi @shinpaul14,
1- You can reproduce all the experiments except the ensemble, because it needs the phase model. You can reproduce them using the target_size parameter (=100 uses only the triplet information; =131 uses the 100 triplets + 31 individual instrument, verb, and target classes):
```
# SwinT
python main.py target_size=100 epochs=20 distill=false exp=teacher
# SwinT + MultiT
python main.py target_size=131 epochs=20 distill=false exp=teacher_multi
# SwinT + SelfD (generate the soft labels first; see the README and the sketch below)
python main.py target_size=100 epochs=40 distill=true exp=student
# SwinT + MultiT + SelfD
python main.py target_size=131 epochs=40 distill=true exp=student_multi
```
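For reference, this is roughly what the soft-label generation step looks like for self-distillation; it is only a sketch, not the repo's script (see the README for that), and `teacher` and `train_loader` are placeholder names:

```python
# Minimal sketch of soft-label generation for self-distillation.
# `teacher` is a trained teacher model and `train_loader` iterates over
# the training frames; both names are placeholders, not the repo's API.
import torch

@torch.no_grad()
def generate_soft_labels(teacher, train_loader, device="cuda"):
    teacher.eval().to(device)
    soft_labels = []
    for images, _ in train_loader:
        # Sigmoid probabilities over the 100/131 classes act as the
        # student's soft targets.
        probs = torch.sigmoid(teacher(images.to(device)))
        soft_labels.append(probs.cpu())
    return torch.cat(soft_labels)  # typically saved to disk for the student run
```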
2- We import the Swin transformer from the timm library, which made some modifications to the model in newer versions. Make sure to downgrade timm to the version specified in requirements.txt (timm==0.6.5); this should solve the problem.
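As a quick sanity check (the Swin variant below is only an example, not necessarily the repo's model_name): with timm==0.6.5 the head replacement yields a [batch, target_size] output, while newer timm versions restructured the Swin head/pooling, which is what produces the [64, 7, 7, 131] shape:

```python
import timm
import torch
import torch.nn as nn

# Example Swin variant; the repo's model_name may differ.
model = timm.create_model("swin_base_patch4_window7_224", pretrained=False)
model.head = nn.Linear(model.head.in_features, 131)
out = model(torch.randn(2, 3, 224, 224))
print(timm.__version__, out.shape)  # expect torch.Size([2, 131]) on timm==0.6.5
```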
Thank you for your help, @Amiiney.
Then, how long did it take to train the teacher and student models?
The training of the teacher model takes around 15 hours and the student model around 30 hours on an RTX 3090 GPU.
Thank you for your reply.
With the currently updated code, I wasn't able to reproduce the teacher's performance.
I ran `python main.py target_size=100 epochs=20 distill=false exp=SwinT`.
The only changes I made were Python 3.7 -> 3.9 and a different torch version.
I also have a question about the mAP and the CholecT45 challenge mAP: why do the two scores differ?
I suspect there was a change in the dataset's CholecT45.csv file that shifted the columns. Did you modify the csv file? Can you pull the newest version and parse CholecT45.csv again?
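A quick way to check for shifted columns after re-pulling (the path is illustrative; compare the output against the CSV layout described in the README):

```python
import pandas as pd

df = pd.read_csv("CholecT45.csv")
print(df.shape)          # row and column counts
print(list(df.columns))  # column order; compare with the README spec
```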
mAP is the overall score per fold without aggregation; cmAP (challenge mAP) is the per-video aggregation that was used in the CholecTriplet2022 challenge. Thanks for your feedback; we added this information to the printed output.
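A rough sketch of the distinction, using sklearn for the AP computation (the repo's actual metric code may aggregate slightly differently; `labels`, `probs`, and `video_ids` are assumed NumPy arrays of multi-hot targets, per-frame scores, and per-frame video IDs):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def fold_map(labels, probs):
    # mAP: per-class AP over ALL frames in the fold, averaged over classes.
    return average_precision_score(labels, probs, average="macro")

def challenge_map(labels, probs, video_ids):
    # cmAP: macro AP computed per video, then averaged over videos.
    # (Classes with no positives in a given video are a known edge case.)
    per_video = [
        average_precision_score(labels[video_ids == v], probs[video_ids == v],
                                average="macro")
        for v in np.unique(video_ids)
    ]
    return float(np.mean(per_video))
```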
When you upload the pre-trained weights, can you also upload the training logs?
With the currently uploaded code, my model overfits when I try to reproduce the results: the training loss decreases but the validation loss increases.
What could be causing this overfitting?
You are correctly reproducing the code! Indeed, the teacher model is overfitting; however, this is not an issue as the weights are saved at the best epoch (in your case: epoch 2) and the main purpose of the teacher model is to generate soft-labels.
After generating the soft labels and training the student model, you should observe a more stable validation loss and increased mAP scores. The behavior of the validation loss is related to the characteristics of the dataset, which includes 100 classes with significant class imbalance. In the CholecTriplet2022 challenge, we optimized for the mean average precision metric, not the validation loss. The key difference is that the loss is dominated by the majority classes, while the mAP metric weights all classes equally.
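To make that last point concrete, here is a toy example (not from the repo): with rare positives, a model that always predicts "absent" achieves a low BCE loss yet a near-chance macro mAP, since the loss is dominated by the abundant negatives while mAP weights every class equally:

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
labels = (rng.random((2000, 100)) < 0.01).astype("float32")  # ~1% positives
logits = torch.full((2000, 100), -5.0)                       # always "absent"

loss = F.binary_cross_entropy_with_logits(logits, torch.from_numpy(labels))
mAP = average_precision_score(labels, torch.sigmoid(logits).numpy(), average="macro")
print(f"BCE loss: {loss.item():.4f}, macro mAP: {mAP:.4f}")  # tiny loss, ~0.01 mAP
```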
Hello, based on reading the code and the MICCAI 2023 paper, I was wondering whether this code is reproducible; it seems to be missing several parts.