NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

PyTorch nnUNet: How do you ensure consistency when comparing to other models? #1106

Closed ekurtulus closed 2 years ago

ekurtulus commented 2 years ago

Related to Model/Framework(s): PyTorch/nnUNet

Describe the bug As far as I can see, you use scikit-learn's KFold splitter: https://github.com/NVIDIA/DeepLearningExamples/blob/db06ff533bf96fc256ce595c171eedae18f7f3ba/PyTorch/Segmentation/nnUNet/data_loading/data_module.py#L84-L85 with random_state=12345. How do you ensure that this train/validation split is exactly the same as in the studies you compare against in your paper, i.e., nnU-Net and UNETR?
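For reference, a minimal sketch of the kind of split those lines perform (the case IDs below are placeholders, and shuffle=True is assumed here since scikit-learn only honors random_state when shuffling):

```python
# Minimal sketch (not the repository's exact code): a 5-fold KFold split of the
# case list with a fixed seed, where fold i is used as the validation set.
from sklearn.model_selection import KFold

case_ids = [f"case_{i:03d}" for i in range(10)]  # placeholder case IDs

kfold = KFold(n_splits=5, shuffle=True, random_state=12345)
for fold, (train_idx, val_idx) in enumerate(kfold.split(case_ids)):
    print(f"fold {fold}: train={train_idx.tolist()} val={val_idx.tolist()}")
```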

michal2409 commented 2 years ago

You should use 5-fold cross-validation, i.e., train your model with --fold i for i in {0, 1, 2, 3, 4} and compute the mean of the Dice scores. A rough sketch of that protocol is shown below.
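A rough, hypothetical sketch of the protocol (the training invocation and the Dice values are placeholders, not the repository's documented workflow or real results):

```python
# Hypothetical sketch of running all five folds and averaging the Dice scores.
import subprocess

for fold in range(5):
    # --fold is the flag mentioned above; any other required flags are omitted here.
    subprocess.run(["python", "main.py", "--fold", str(fold)], check=True)

# Placeholder per-fold Dice values; in practice they are read from each fold's logs.
fold_dice = [0.90, 0.91, 0.89, 0.92, 0.90]
print(f"5-fold mean Dice: {sum(fold_dice) / len(fold_dice):.4f}")
```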

ekurtulus commented 2 years ago

The problem is that when each study uses a different seed for the random split, the reported results are computed on different splits, regardless of how many folds are used. How is this prevented?

michal2409 commented 2 years ago

The DKFZ implementation also uses random_state=12345: https://github.com/MIC-DKFZ/nnUNet/blob/6844361bb1dd60efb5f35112e248cf377902cd53/nnunet/training/network_training/nnUNetTrainerV2.py#L296.
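In other words (a minimal sketch, assuming both codebases pass the same sorted case list to scikit-learn), a fixed random_state makes the split fully deterministic, so two independent calls produce identical folds:

```python
# Minimal sketch: with the same input ordering, n_splits, shuffle=True, and
# random_state, scikit-learn's KFold reproduces identical train/val splits,
# which is why both implementations seed it with 12345.
from sklearn.model_selection import KFold

cases = sorted(f"case_{i:03d}" for i in range(20))  # placeholder case IDs

splits_a = list(KFold(n_splits=5, shuffle=True, random_state=12345).split(cases))
splits_b = list(KFold(n_splits=5, shuffle=True, random_state=12345).split(cases))

for (tr_a, va_a), (tr_b, va_b) in zip(splits_a, splits_b):
    assert (tr_a == tr_b).all() and (va_a == va_b).all()
print("Splits are identical across runs.")
```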

michal2409 commented 2 years ago

However, all experiments reported in our paper were run by us, so all of them used the same splits generated with random_state=12345.

ekurtulus commented 2 years ago

Okay, thanks for the clarification.