MIC-DKFZ / nnUNet


Anyone investigated prediction performance of 5-fold versus 1-fold, with tta and without? #2224

Open mark-joe opened 5 months ago

mark-joe commented 5 months ago

Not an issue, but an inquiry into whether anyone has done research on the inference options that nnU-Net offers: 1 or 5 folds, and TTA vs. no TTA. So far we have run nnU-Net in the most costly setting, 5 folds + TTA (full 3D), which takes 246 seconds for a particular experiment. We plan to run in an online setting, so any time we can save is valuable. The shortest run is obtained with 1 fold and TTA disabled: 25 seconds. That is quite a reduction; the two intermediate options both take about one minute. Has anyone looked into this before, and what did you find regarding possible changes in performance?

The problem with my own experiments is that the ground truth is a bit fuzzy: I cannot say whether one voxel more or less in the prediction is better or worse. The network output is probably as good as the training/test labels, so for me it is actually quite hard to compare the different settings. Well, I hope somebody has looked into this.
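For reference, a minimal sketch of how the fastest of these settings (1 fold, no TTA) can be selected through nnU-Net v2's Python predictor. Argument names follow the `nnUNetPredictor` API as documented in the nnU-Net v2 repo; the model folder and image paths are placeholders:

```python
import torch
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

# Fastest setting discussed above: a single fold, test-time augmentation (mirroring) disabled.
predictor = nnUNetPredictor(
    tile_step_size=0.5,        # default sliding-window overlap
    use_gaussian=True,
    use_mirroring=False,       # use_mirroring=False disables TTA
    device=torch.device('cuda', 0),
)
predictor.initialize_from_trained_model_folder(
    '/path/to/nnUNet_results/DatasetXXX_name/nnUNetTrainer__nnUNetPlans__3d_fullres',
    use_folds=(0,),            # a single fold instead of (0, 1, 2, 3, 4)
    checkpoint_name='checkpoint_final.pth',
)
predictor.predict_from_files(
    '/path/to/imagesTs',       # folder containing *_0000.nii.gz images
    '/path/to/predictions',    # output folder
    save_probabilities=False,
    overwrite=True,
)
```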

mrokuss commented 5 months ago

Hey @mark-joe

Generally, this is a typical issue one faces when deploying nnUNet under lower resource conditions. From my experience, rather switch off TTA than use only 1 fold; that way you at least leverage all the data. For larger images you can also try increasing the `tile_step_size` in the predictor from 0.5 to about 0.7. In the end it comes down to how much compute you can spare and how much performance you need to find your golden ratio.
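A short sketch of that recommendation, assuming "tile_step" refers to the `tile_step_size` argument of `nnUNetPredictor` (paths are placeholders; larger step sizes mean less sliding-window overlap and therefore fewer patches per image):

```python
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

# All 5 folds kept, TTA switched off, coarser sliding-window stride for speed.
predictor = nnUNetPredictor(
    tile_step_size=0.7,        # default is 0.5; larger values trade accuracy for speed
    use_mirroring=False,       # drop TTA rather than dropping folds
)
predictor.initialize_from_trained_model_folder(
    '/path/to/nnUNet_results/DatasetXXX_name/nnUNetTrainer__nnUNetPlans__3d_fullres',
    use_folds=(0, 1, 2, 3, 4),  # keep all folds to leverage all the training data
    checkpoint_name='checkpoint_final.pth',
)
```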

Hope this helps!