Closed: HadiHammoud44 closed this issue 1 week ago.
Dear Hadi, as described in the Table 2 caption, the labels of the val sets are unseen during pre-training. In most related SSL works, we find that the val sets were not strictly excluded from pre-training either. This is because, for datasets without an available test set, we usually have to adopt 5-fold validation, and in that case you cannot exclude the val data from pre-training.
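For context, here is a minimal sketch (not taken from this repository) of why 5-fold cross-validation makes it impossible to exclude all validation data from a single pre-training corpus: every case serves as validation data in exactly one fold. The case IDs and seed below are illustrative only.

```python
# Illustrative only: with 5-fold CV, the union of the per-fold val splits
# covers the whole dataset, so no case can be held out of pre-training
# for all folds at once.
from sklearn.model_selection import KFold

cases = [f"case_{i:03d}" for i in range(30)]  # hypothetical case IDs
kfold = KFold(n_splits=5, shuffle=True, random_state=12345)

val_cases_seen = set()
for fold, (train_idx, val_idx) in enumerate(kfold.split(cases)):
    val_cases_seen.update(cases[i] for i in val_idx)

# Every case ends up in some fold's validation split.
assert val_cases_seen == set(cases)
```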
Please allow me to add a question: do you use nnUNet's default 5-fold cross-validation, or the split provided by the dataset (for example, for BraTS21)?
For most of the experiments, we use the same splits as nnUNet. For BraTS21, however, we use the split provided by the dataset, as shown in brats21_folds.json.
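As a rough sketch of how such a pre-defined split could be consumed, the snippet below assumes brats21_folds.json is a list of `{"train": [...], "val": [...]}` entries (the same layout as nnUNet's splits_final.json); the actual schema in the repository may differ, and the function name and paths are hypothetical.

```python
# Hedged sketch: load one fold from a dataset-provided split file.
import json

def load_fold(splits_path: str, fold: int):
    """Return (train_ids, val_ids) for one fold of a pre-defined split file."""
    with open(splits_path) as f:
        splits = json.load(f)
    return splits[fold]["train"], splits[fold]["val"]

# Example usage (file name and fold index are illustrative only):
# train_ids, val_ids = load_fold("brats21_folds.json", fold=0)
```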
Thanks for answering. I understand that you adopted 5-fold cross-validation for finetuning, but do you report the mean and std across folds anywhere in the paper or supplementary (if one exists)? If not, how did you choose the values reported in the tables?
Dear Hadi, we did not report the 5-fold results in the paper because, with so many datasets and compared methods, the tables would become too bulky and confusing. For most of the datasets there are pre-defined validation splits (e.g., WORD, BraTS, CT-RATE, ...), and we adopt the same settings for a fair comparison. For some datasets with test leaderboards (e.g., AMOS, FLARE23, KiTS, ...), we report the results on the test leaderboard (e.g., https://codalab.lisn.upsaclay.fr/competitions/12239#results). For the other datasets, we use the same fold as nnUNet for all compared methods for a fair comparison.
Many thanks for your work. I was checking data_utils_abdomen.py and realized that the validation sets of some datasets, like BTCV, AMOS, ... (which are later used for finetuning), are being used during pre-training. Could you please clarify? (Table 1 of the paper says that only the training set of BTCV is used, which has size 24.)