MIC-DKFZ / nnUNet


Anyone investigated prediction performance of 5-fold versus 1-fold, with tta and without? #2224

Open mark-joe opened 5 months ago

mark-joe commented 5 months ago

Not an issue, but an inquiry into whether anyone has done research on the inference options that nnU-Net offers: 1 or 5 folds, and TTA vs. no TTA. So far we have run nnU-Net in the most costly setting, 5 folds + TTA (full 3D), which takes 246 seconds for a particular experiment. We plan to run in an online setting, so any time we can save is valuable. The shortest run is obtained with 1 fold and TTA disabled: 25 seconds. That is quite a reduction; the two intermediate options both take about one minute. Has anyone looked into this before, and what did you find regarding possible changes in performance?

The problem with my own experiments is that the ground truth is a bit fuzzy: I cannot say whether one voxel more or less in the prediction is better or worse. The network output is probably as good as the training/test labels, so for me it is actually quite hard to compare the different settings. Well, I hope somebody has looked into this.
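For reference, a minimal sketch of how the fastest of these settings (1 fold, no TTA) can be selected through nnU-Net v2's Python predictor. Argument names follow the `nnUNetPredictor` API as documented in the nnU-Net v2 repo; the model folder and image paths are placeholders:

```python
import torch
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

# Fastest setting discussed above: a single fold, test-time augmentation (mirroring) disabled.
predictor = nnUNetPredictor(
    tile_step_size=0.5,        # default sliding-window overlap
    use_gaussian=True,
    use_mirroring=False,       # use_mirroring=False disables TTA
    device=torch.device('cuda', 0),
)
predictor.initialize_from_trained_model_folder(
    '/path/to/nnUNet_results/DatasetXXX_name/nnUNetTrainer__nnUNetPlans__3d_fullres',
    use_folds=(0,),            # a single fold instead of (0, 1, 2, 3, 4)
    checkpoint_name='checkpoint_final.pth',
)
predictor.predict_from_files(
    '/path/to/imagesTs',       # folder containing *_0000.nii.gz images
    '/path/to/predictions',    # output folder
    save_probabilities=False,
    overwrite=True,
)
```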

mrokuss commented 5 months ago

Hey @mark-joe

Generally, this is a typical issue one faces when deploying nnUNet under lower resource conditions. From my experience, rather switch off TTA than use only 1 fold; that way you at least leverage all the data. For larger images you can also try increasing the `tile_step_size` in the predictor from 0.5 to about 0.7. In the end it comes down to how much compute you can spare and how much performance you need to find your golden ratio.
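A short sketch of that recommendation, assuming "tile_step" refers to the `tile_step_size` argument of `nnUNetPredictor` (paths are placeholders; larger step sizes mean less sliding-window overlap and therefore fewer patches per image):

```python
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

# All 5 folds kept, TTA switched off, coarser sliding-window stride for speed.
predictor = nnUNetPredictor(
    tile_step_size=0.7,        # default is 0.5; larger values trade accuracy for speed
    use_mirroring=False,       # drop TTA rather than dropping folds
)
predictor.initialize_from_trained_model_folder(
    '/path/to/nnUNet_results/DatasetXXX_name/nnUNetTrainer__nnUNetPlans__3d_fullres',
    use_folds=(0, 1, 2, 3, 4),  # keep all folds to leverage all the training data
    checkpoint_name='checkpoint_final.pth',
)
```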

Hope this helps!