wenxi-yue / SurgicalSAM

[AAAI2024] Official implementation of SurgicalSAM
MIT License

Cannot achieve the same training performance on endovis_2018 either #20

Open ggzzhhhh opened 3 months ago

ggzzhhhh commented 3 months ago

I have trained the model several times, but the best Challenge IoU only reaches 78.0220. Here is the log file: log.txt

ggzzhhhh commented 3 months ago

Hardware and software information: [five screenshots]

wenxi-yue commented 3 months ago

Hi,

We think that the performance discrepancy might be due to variations in software versions, hardware, or dependencies, which can affect the augmented data and training process. Our details are:

GPU: NVIDIA V100 16GB
Operating System: Ubuntu 20.04.3
Conda Environment Packages and Versions: [screenshot]

We did not test the GPU setup you are using in our experiments.

We have also included our training log for your reference: log_endovis2018.txt.

ggzzhhhh commented 3 months ago

Thank you for the reply. I have another question: I have trained several times and the results differ between runs, even though the seed is fixed. Have the authors encountered this?

First time:
Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 47.901, 'IoU': 47.901, 'mcIoU': 29.717, 'mIoU': 44.871, 'cIoU_per_class': [48.219, 19.663, 44.317, 59.19, 14.74, 19.386, 2.504]}

Second time:
Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 44.407, 'IoU': 44.407, 'mcIoU': 27.575, 'mIoU': 43.86, 'cIoU_per_class': [41.774, 24.628, 39.165, 57.744, 20.577, 7.29, 1.846]}

Third time:
Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 56.588, 'IoU': 56.588, 'mcIoU': 36.482, 'mIoU': 52.821, 'cIoU_per_class': [64.46, 30.028, 59.921, 58.788, 20.341, 19.155, 2.683]}
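To put a number on how far the three runs above diverge at epoch 5, the validation lines can be parsed and summarized. This is only a rough sketch: the helper names `parse_iou_results` and `summarize` are hypothetical and not part of the SurgicalSAM code, and it assumes each log line ends with a Python-style dict after `IoU_Results: `, as in the lines quoted above.

```python
import ast
import statistics

# Validation lines copied verbatim from the three runs quoted above (epoch 5).
LOG_LINES = [
    "Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 47.901, 'IoU': 47.901, 'mcIoU': 29.717, 'mIoU': 44.871, 'cIoU_per_class': [48.219, 19.663, 44.317, 59.19, 14.74, 19.386, 2.504]}",
    "Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 44.407, 'IoU': 44.407, 'mcIoU': 27.575, 'mIoU': 43.86, 'cIoU_per_class': [41.774, 24.628, 39.165, 57.744, 20.577, 7.29, 1.846]}",
    "Validation - Epoch: 5/499; IoU_Results: {'challengIoU': 56.588, 'IoU': 56.588, 'mcIoU': 36.482, 'mIoU': 52.821, 'cIoU_per_class': [64.46, 30.028, 59.921, 58.788, 20.341, 19.155, 2.683]}",
]


def parse_iou_results(line):
    """Return the metrics dict that follows 'IoU_Results: ' in a log line."""
    _, sep, payload = line.partition("IoU_Results: ")
    if not sep:
        raise ValueError("line does not contain 'IoU_Results: '")
    # literal_eval only accepts Python literals, so it is safe for log text.
    return ast.literal_eval(payload.strip())


def summarize(lines, key):
    """Mean, sample standard deviation, and max-min spread of one metric."""
    values = [parse_iou_results(line)[key] for line in lines]
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "spread": max(values) - min(values),
    }


if __name__ == "__main__":
    # 'challengIoU' is spelled as it appears in the log output.
    for metric in ("challengIoU", "mcIoU", "mIoU"):
        print(metric, summarize(LOG_LINES, metric))
```

Epoch 5 is very early in a 500-epoch schedule, so a large spread here does not by itself say how much the final best scores differ; the same summary can be run on the best-epoch lines of each log.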

wenxi-yue commented 2 months ago

Hi,

This is also the case on our end. We have fixed the seed and set deterministic to True, but it seems that other factors still lead to non-deterministic behavior in model training.
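For readers who want to check their own setup, here is a minimal sketch of the seeding and determinism settings commonly used with PyTorch. It is not taken from the SurgicalSAM code: the function name `seed_everything` and its `strict` flag are hypothetical, and the thread does not say which exact flags the authors set. NumPy and PyTorch are imported optionally so the snippet also runs where they are not installed.

```python
import os
import random

try:
    import numpy as np
except ImportError:  # NumPy is optional in this sketch
    np = None

try:
    import torch
except ImportError:  # PyTorch is optional in this sketch
    torch = None


def seed_everything(seed, strict=False):
    """Seed the common RNGs and request deterministic cuDNN kernels.

    Hypothetical helper, not the repository's implementation.
    """
    random.seed(seed)
    if np is not None:
        np.random.seed(seed)
    if torch is None:
        return

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable

    # Ask cuDNN for deterministic algorithms and disable autotuning,
    # which can otherwise select different kernels from run to run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    if strict:
        # Some cuBLAS routines need this to be deterministic; it must be
        # set before the first CUDA call to take effect.
        os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
        # warn_only (PyTorch 1.11+) warns on ops that have no deterministic
        # implementation instead of raising an error.
        torch.use_deterministic_algorithms(True, warn_only=True)


if __name__ == "__main__":
    seed_everything(42)
    print(random.random())
```

Even with all of these set, results can still differ between runs, for example when an operation has no deterministic GPU implementation, when data-loader workers are seeded separately, or when the GPU model or library versions change. This matches the behavior described in this thread.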

However, we tried different seeds and obtained reasonably stable results, as shown in Table 7 of our paper (arXiv version).