sct-pipeline / fmri-segmentation

Repository for the project on automatic spinal cord segmentation based on fMRI EPI data
MIT License

Training and Inference discussion for baseline model #34

Closed rohanbanerjee closed 4 months ago

rohanbanerjee commented 7 months ago

What is the baseline model?

The baseline model is the model trained on the images marked ✅ in the QCs mentioned in #25. A total of 96 images were used to train this model. A list of subjects (for later reference) is below: participants_baseline.csv

The model was trained in 6 different settings: 5 models for the 5-fold cross-validation and 1 fold_all model (discussion can be found here: https://github.com/MIC-DKFZ/nnUNet/issues/1364#issuecomment-1492075312). The nnUNetv2 training config (containing preprocessing and hyperparameters) is: plans.json
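For readers unfamiliar with the 5-fold setup, here is a minimal sketch of how a 5-fold cross-validation split partitions the 96 training images. This is only an illustration of the fold sizes involved; nnU-Net v2 builds its own splits internally, so the indices below are assumptions, not the actual split.

```python
# Illustrative 5-fold partition of 96 images (round-robin assignment).
# nnU-Net v2 computes its own splits; this only shows the fold sizes.
n_images = 96
n_folds = 5

folds = [list(range(i, n_images, n_folds)) for i in range(n_folds)]

for k, val_ids in enumerate(folds):
    val = set(val_ids)
    train_ids = [i for i in range(n_images) if i not in val]
    print(f"fold_{k}: train={len(train_ids)}, val={len(val_ids)}")

# fold_all, by contrast, trains on all 96 images with no held-out
# validation set.
```

Each fold trains on roughly 4/5 of the images and validates on the remaining 1/5, which is why fold_all (trained on everything) sees the most ground truth.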

After the model was trained, the inference was run on the ❌ (failed segmentations) from #25. Below are the QCs from all the folds and fold_all:

qc_fold_0.zip qc_fold_1.zip qc_fold_2.zip qc_fold_3.zip qc_fold_4.zip qc_fold_all.zip

The steps to reproduce the above QC results (/run inference) are the following:

  1. Clone this repository
  2. cd fmri-segmentation
  3. Download the model weights (the whole folder) from the link: https://drive.google.com/drive/folders/1WSn-15wGWz6i2_aZeQTwKls2sZ6dpfHf?usp=share_link
  4. Install dependencies
    pip install -r run_nnunet_inference_requirements.txt
  5. Run the command:
    python run_nnunet_inference.py --path-dataset <PATH TO FOLDER CONTAINING IMAGES, SUFFIXED WITH _0000> --path-out <PATH TO OUTPUT FOLDER> --path-model <PATH TO DOWNLOADED WEIGHTS FOLDER>

Next steps:

rohanbanerjee commented 7 months ago

Choose the fold whose inference results will be used for the next round of training

Out of all the folds_{0-4}, fold_1 has the best performance in terms of Dice score on its respective test set. That said, I don't think Dice score on a single fold's test set is the best criterion for choosing the fold (or model). In my opinion, the baseline model should be trained on the maximum number of good ground-truth images, and fold_all is the setting that fits this: the fold_all model is trained on all the training images and, unlike folds_{0-4}, does not hold out a validation set.
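For reference, the Dice score discussed above can be computed as follows. This is a minimal sketch of the standard metric, 2|A∩B| / (|A| + |B|), not the code nnU-Net uses internally.

```python
import numpy as np

def dice_score(seg_a, seg_b):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(seg_a, dtype=bool)
    b = np.asarray(seg_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy prediction vs. ground truth (hypothetical 2x3 masks):
pred = np.array([[0, 1, 1], [0, 1, 0]])
gt   = np.array([[0, 1, 0], [0, 1, 1]])
print(dice_score(pred, gt))  # 2*2 / (3+3) ≈ 0.667
```

A Dice of 1.0 means the predicted and ground-truth spinal cord masks overlap exactly; 0.0 means no overlap at all.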

After going through the QC and based on the above conclusion, I am going ahead with fold_all.

rohanbanerjee commented 7 months ago

I chose 30 subjects from the fold_all inference results for manual correction. The list of the 30 chosen subjects is below: qc_report_fold_all.zip

I manually corrected the segmentations for these subjects; the QC for them is attached below: qc_fold_all_corrected.zip

After @jcohenadad approves these segmentations, I'll add these images to the training set and start the re-training.

CC: @MerveKaptan

jcohenadad commented 7 months ago

review: qc_fail.yml.zip

details:

spinefmri-sub-genevaR104_003_0000.nii.gz [screenshot]

spinefmri-sub-genevaR106_005_0000.nii.gz [screenshot]

spinefmri-sub-genevaR108_007_0000.nii.gz [screenshot]

spinefmri-sub-genevaR202_018_0000.nii.gz [screenshots]

spinefmri-sub-genevaR207_023_0000.nii.gz [screenshot]

spinefmri-sub-genevaR209_025_0000.nii.gz [screenshot]

spinefmri-sub-genevaR211_027_0000.nii.gz [screenshot]

spinefmri_sub-nwM09_task-motor_bold_0000.nii.gz overseg: [screenshot]

spinefmri_sub-nwM16_task-motor_bold_0000.nii.gz [screenshot]

"Download all" button is not there-- please make sure to use the latest version of SCT:

image

rohanbanerjee commented 4 months ago

Closing the issue since the baseline model training was successfully completed (including running inference and manual correction).