Closed smallkaka closed 1 year ago
Hi @smallkaka
This is because of the nondeterministic algorithms used in PyTorch. For more details, please refer to the PyTorch Reproducibility notes.
In brief, when the --benchmark flag is used during training, cuDNN benchmarking is enabled to accelerate the convolution operations, but in that mode the convolutions are nondeterministic. Disabling the --benchmark flag may improve reproducibility, but some other operations, such as trilinear upsampling, are still nondeterministic.
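As a rough illustration of the trade-off described above, the following is a minimal sketch of how one might switch between benchmark mode and a more reproducible setup. The helper name `make_reproducible` and its arguments are hypothetical and are not taken from this repository's train.py. The sketch also assumes a PyTorch version recent enough to support the `warn_only` argument of `torch.use_deterministic_algorithms`.

```python
import os
import random

import torch


def make_reproducible(seed: int = 0, benchmark: bool = False) -> None:
    """Hypothetical helper: seed the RNGs and choose speed or determinism."""
    # Some cuBLAS operations require this variable to run deterministically;
    # it must be set before the first CUDA call.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and CUDA generators

    # benchmark=True lets cuDNN auto-tune convolution algorithms (faster,
    # but the selected algorithm may differ between runs).
    torch.backends.cudnn.benchmark = benchmark
    torch.backends.cudnn.deterministic = not benchmark

    # warn_only=True: operations without a deterministic implementation
    # emit a warning instead of raising an error.
    torch.use_deterministic_algorithms(not benchmark, warn_only=True)


if __name__ == "__main__":
    make_reproducible(seed=123)
    first = torch.rand(3)
    make_reproducible(seed=123)
    second = torch.rand(3)
    print(torch.equal(first, second))
```

Even with these settings, results can still differ across PyTorch versions, hardware, and operations that only warn here, so this narrows the gap rather than removing it.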
Overall, the nondeterministic algorithms used in PyTorch will inevitably affect reproduced results; this is normal. Using the same PyTorch version as described in the paper and the same hardware setting (DDP on 2 NVIDIA RTX 3090 GPUs) may help you get the closest result to our paper, but exact reproduction is still not guaranteed, as stated in the PyTorch documentation:
Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. Furthermore, results may not be reproducible between CPU and GPU executions, even when using identical seeds.
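One common reason for such run-to-run differences, which is an assumption added here rather than something stated in this thread, is that floating-point addition is not associative. A parallel reduction that sums the same values in a different order can therefore return a slightly different result. A minimal sketch in plain Python:

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different groupings.
left_first = (values[0] + values[1]) + values[2]
right_first = values[0] + (values[1] + values[2])

print(left_first == right_first)              # False
print(math.isclose(left_first, right_first))  # True
```

The two sums differ only in the last bits, but such differences can accumulate over many training steps.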
OK. Thank you for your reply.
Author, thank you for your work. When I reproduced the code with the number of negative samples set to N=100, there was still a small gap between my results and those in your paper.
LA dataset:
bg_dice: 0.9896 ± 0.0049
la_dice: 0.8712 ± 0.0577
bg_hd95: 2.2933 ± 1.3194
la_hd95: 15.1779 ± 16.0119
bg_asd: 0.4185 ± 0.2263
la_asd: 3.6897 ± 3.3981
Pancreas dataset:
bg_dice: 0.996 ± 0.0015
pancreas_dice: 0.7719 ± 0.0702
bg_hd95: 1.2585 ± 0.4075
pancreas_hd95: 12.1445 ± 10.8312
bg_asd: 0.2605 ± 0.1133
pancreas_asd: 2.8957 ± 1.3297
Here is my training command:
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 train.py --mixed --benchmark --task pancreas --exp_name pancreas --wandb
paper result: