sct-pipeline / fmri-segmentation

Repository for the project on automatic spinal cord segmentation based on fMRI EPI data

Try running contrast-agnostic model on the EPI data #23

Closed: jcohenadad closed this issue 8 months ago

jcohenadad commented 1 year ago

If results look OK, I suggest re-training the contrast-agnostic model so it can also work for EPI data.

rohanbanerjee commented 1 year ago

I have run the contrast-agnostic model on the EPI data and, visually, the results look okay. For reference, the following command was used:

python /home/GRAMES.POLYMTL.CA/robana/duke/temp/rohan/fmri_sc_seg/monai/run_inference_single_image.py --path-img {image_path} --chkp-path /home/GRAMES.POLYMTL.CA/robana/duke/temp/muena/contrast-agnostic/final_monai_model/nnunet_nf=32_DS=1_opt=adam_lr=0.001_AdapW_CCrop_bs=2_64x192x320_20230918-2253 --path-out /home/GRAMES.POLYMTL.CA/robana/duke/temp/rohan/fmri_sc_seg/results/monai_results --device cpu
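For convenience, here is a minimal sketch of looping that same command over a folder of test images. The script and checkpoint paths are copied from the command above; TEST_DIR and OUT_DIR are hypothetical placeholders.

```python
# Sketch: batch-run the contrast-agnostic model over all test images.
# TEST_DIR and OUT_DIR are hypothetical; script/checkpoint paths are from above.
import subprocess
from pathlib import Path

SCRIPT = "/home/GRAMES.POLYMTL.CA/robana/duke/temp/rohan/fmri_sc_seg/monai/run_inference_single_image.py"
CHKP = "/home/GRAMES.POLYMTL.CA/robana/duke/temp/muena/contrast-agnostic/final_monai_model/nnunet_nf=32_DS=1_opt=adam_lr=0.001_AdapW_CCrop_bs=2_64x192x320_20230918-2253"
TEST_DIR = Path("data/test")       # hypothetical folder of *.nii.gz test images
OUT_DIR = "results/monai_results"  # hypothetical output folder

for img in sorted(TEST_DIR.glob("*.nii.gz")):
    subprocess.run(
        ["python", SCRIPT,
         "--path-img", str(img),
         "--chkp-path", CHKP,
         "--path-out", OUT_DIR,
         "--device", "cpu"],
        check=True,  # stop at the first failing subject
    )
```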
jcohenadad commented 1 year ago

Please produce a QC report so the team can conveniently look at the predictions (add the GT as well). @Nilser3 can help you with that.

rohanbanerjee commented 1 year ago

I am attaching the QC reports of both the ground truths and the predictions of the contrast-agnostic model on the test set of the EPI data.

gt_qc.zip monai_preds_qc.zip

jcohenadad commented 1 year ago

@rohanbanerjee:

rohanbanerjee commented 1 year ago

fmri_sc-seg_qc.zip

Test performance (Dice score and Hausdorff distance):

| Model | Dice score | Hausdorff distance |
| --- | --- | --- |
| Contrast-agnostic model | 0.84 ± 0.055 | 3.21 ± 1.67 |
| nnUNet | 0.925 ± 0.03 | 1.73 ± 0.84 |
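The issue does not show how these metrics were computed; below is a minimal sketch of how they could be obtained with MONAI's metric classes, assuming binarized channel-first masks. The 95th-percentile Hausdorff distance is assumed here and may differ from the exact variant reported above.

```python
# Sketch: Dice and Hausdorff distance with MONAI's metric classes.
import torch
from monai.metrics import DiceMetric, HausdorffDistanceMetric

dice = DiceMetric(include_background=True, reduction="mean")
hd95 = HausdorffDistanceMetric(include_background=True, percentile=95, reduction="mean")

# Binarized masks of shape (batch, channel, H, W, D); random stand-ins here.
pred = torch.randint(0, 2, (1, 1, 32, 48, 128)).float()
gt = torch.randint(0, 2, (1, 1, 32, 48, 128)).float()

dice(y_pred=pred, y=gt)
hd95(y_pred=pred, y=gt)
print(f"Dice: {dice.aggregate().item():.3f}, HD95: {hd95.aggregate().item():.2f} voxels")
```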

Interpreting the QC

Each test subject has 3 different masks. Depending on which file is passed to the -s flag (a sketch for generating these QC entries follows the list):

  1. SUBJECT_NAME.nii.gz --> ground truth (drawn by the site)
  2. SUBJECT_NAME_pred.nii.gz --> mask predicted by the nnUNet model trained specifically on EPI data
  3. SUBJECT_NAME_bin.nii.gz --> mask predicted by the contrast-agnostic model
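A minimal sketch of how one QC entry per mask type could be generated with SCT's sct_qc command. The data layout, subject list, and background-image naming are hypothetical; only the suffix convention (none / _pred / _bin) is taken from the list above.

```python
# Sketch: one sct_qc entry per mask type for each subject.
import subprocess
from pathlib import Path

data = Path("data/test")          # hypothetical data layout
subjects = ["sub-nwM10"]          # hypothetical subject list
suffixes = ["", "_pred", "_bin"]  # GT, nnUNet prediction, contrast-agnostic prediction

for sub in subjects:
    img = data / f"{sub}_mean.nii.gz"  # hypothetical: mean EPI as background image
    for suffix in suffixes:
        seg = data / f"{sub}{suffix}.nii.gz"
        # sct_deepseg_sc is passed only so the report uses the segmentation template
        subprocess.run(
            ["sct_qc", "-i", str(img), "-s", str(seg),
             "-p", "sct_deepseg_sc", "-qc", "qc_report"],
            check=True,
        )
```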

Conclusions regarding the QC (common observations):

All conclusions take into account that this dataset is completely out-of-distribution for the contrast-agnostic model (it never saw this type of data during training), whereas the nnUNet was trained only on the EPI data.

  1. In cases where the spinal cord is clearly visible (i.e. the contrast between the spinal cord and the CSF is good), the contrast-agnostic model performs as well as or better than the nnUNet model.

  2. In cases where the contrast is less clear, the nnUNet model performs better; the contrast-agnostic model is prone to over-/under-segmentation. For example, sub-nwM10:

[Four screenshots of example slices for sub-nwM10]

(order: Image --> ground truth --> nnUNet --> contrast-agnostic)

The next step would be to fine-tune the contrast-agnostic model, train a MONAI model from scratch on the EPI data, and compare both with the baselines.

rohanbanerjee commented 1 year ago

Relevant issue: https://github.com/sct-pipeline/contrast-agnostic-softseg-spinalcord/issues/83

rohanbanerjee commented 11 months ago

Updating the issue with what has been tried so far.

Fine-tuning:

Objective

The main objective of the fine-tuning was to use the pre-trained weights of the contrast-agnostic model to transfer knowledge to a model trained to segment the spinal cord on EPI data. This way, the newly trained model would also work on EPI data.

It is expected that the fine-tuned model would perform well, since its weights and biases already encode a lot of "spinal cord" context.

Path to checkpoint used: duke/temp/muena/contrast-agnostic/final_monai_model/nnunet_nf=32_DS=1_opt=adam_lr=0.001_AdapW_CCrop_bs=2_64x192x320_20230918-2253

This PR https://github.com/sct-pipeline/contrast-agnostic-softseg-spinalcord/pull/85 adds the functionality to initialize a model with pre-trained weights.
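For illustration, here is a minimal sketch of what initializing a model from pre-trained weights can look like. The actual implementation lives in the PR above; the architecture and checkpoint path below are hypothetical stand-ins.

```python
# Sketch: initialize a model from the contrast-agnostic checkpoint before
# fine-tuning on EPI data. Architecture and path are hypothetical.
import torch
from monai.networks.nets import UNet

model = UNet(
    spatial_dims=3, in_channels=1, out_channels=1,
    channels=(32, 64, 128, 256), strides=(2, 2, 2),
)

ckpt = torch.load("contrast_agnostic.ckpt", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # Lightning checkpoints nest weights here

# strict=False transfers every layer whose name and shape match and skips
# the rest, so training starts from the pre-trained "spinal cord" features.
missing, unexpected = model.load_state_dict(state, strict=False)
print(f"{len(missing)} missing / {len(unexpected)} unexpected keys")
```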

Results and Observations

  1. Test Dice score: 0.47 ± 0.24; test Hausdorff distance: 9.69 ± 7.97
  2. Unstable validation soft Dice curve: [screenshot of the validation soft Dice curve]

After investigation, the reason for this poor result and unstable training turned out to be the crop size. The crop size used for training was 64x192x320, i.e. the one the contrast-agnostic model was trained on. However, the EPI images are smaller than this crop size: the median crop size of the EPI data (as per the nnUNet plans) should be 32x48x128. If that crop size is used instead, the model throws an error because of a size mismatch with the pre-trained model. In the end, the excessive padding feeds noise to the model, which results in unstable training.

Conclusion:

  1. It seems the contrast-agnostic model cannot be fine-tuned out-of-the-box because of the size-mismatch error.
  2. A workaround could be to resize the images by padding (see the sketch below).
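A minimal sketch of the padding workaround from point 2, assuming MONAI transforms are used for preprocessing: ResizeWithPadOrCrop pads (or crops) each volume to the 64x192x320 size the pre-trained model expects.

```python
# Sketch: pad/crop an EPI volume to the size the pre-trained model expects.
import torch
from monai.transforms import ResizeWithPadOrCrop

pad = ResizeWithPadOrCrop(spatial_size=(64, 192, 320))

epi = torch.zeros(1, 32, 48, 128)  # channel-first stand-in at the EPI median size
padded = pad(epi)
print(padded.shape)  # -> (1, 64, 192, 320); most voxels are now padded background
```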
jcohenadad commented 11 months ago

If you set CCrop_bs=2_64x192x320, doesn't MONAI do some padding on the input image? If not, then I agree that padding should be tried.

According to @naga-karthik, there is indeed padding; however, the poor training quality is likely caused by the excessive padding, since the EPI images are much smaller than the ones used to train the original contrast-agnostic model.

rohanbanerjee commented 8 months ago

We have decided to continue with nnUNet for now.