ivadomed / model_seg_ms_mp2rage

Model repository for MS lesion segmentation on MP2RAGE data from University of Basel
MIT License
2 stars 0 forks source link

`ivadomed` configs for training spinal cord segmentation #19

Closed uzaymacar closed 2 years ago

uzaymacar commented 2 years ago

This PR adds the ivadomed config file train_scseg.json and test_scseg.json seg_sc.json (only kept one config file as per this comment) for training and testing a UNet3D model for spinal cord (SC) segmentation.

In training, we will utilize the newly implemented augmentations in ivadomed (as described in #16) just as in #11.

To Reproduce

0. Make sure you have the following dependencies:

1. Clone the repository:

git clone https://github.com/ivadomed/model_seg_ms_mp2rage.git
cd model_seg_ms_mp2rage

2. Clone the data:

git clone git@data.neuro.polymtl.ca:datasets/basel-mp2rage
cd basel-mp2rage
git annex get .
cd ..

3. Preprocess the data:

sct_run_batch -script preprocessing/preprocess_data.sh -path-data basel-mp2rage -path-output basel-mp2rage-preprocessed -jobs <JOBS>

Optional: Run quality-control (QC) on preprocessed data:

python preprocessing/qc_preprocess.py -s basel-mp2rage-preprocessed

4. Run inference on the models (see below for a longer description of models):

git checkout um/ivadomed_scseg_training
cp -r ~/duke/temp/uzay/saved_models_basel/seg_sc_output .

# Training was carried out with: ivadomed --train -c config/seg_sc.json

# Test
ivadomed --test -c config/seg_sc.json

# Segment entire dataset
ivadomed --segment -c config/seg_sc.json
uzaymacar commented 2 years ago

Yeah agreed. I only did two JSON files so that it is parallel to lesion segmentation config files where we need two JSON files as the "target_suffix" key is changed during training and testing. The alternative would be to manually change this key before training and testing but that is not so clean.

I can fix this branch so that there is only one config file, but then the question is: is it OK for SC segmentation to have one config file whereas the lesion segmentation has two config files. It might be difficult to interpret for someone else coming to the repo. Perhaps adding documentation on why we have two config files for that case will solve this issue?

jcohenadad commented 2 years ago

whereas the lesion segmentation has two config files.

i would also only keep one config file on the other repos

uzaymacar commented 2 years ago

Sorry coming back to this again @jcohenadad, I am not sure how I can keep one config file for lesion segmentation. Just to be clear, the files we are talking about are train_randomrater.json, test_randomrater_on_rater1.json, and test_randomrater_on_rater2.json. Only thing that comes to mind is to keep the training config (renaming it accordingly), and add a line or two on README that describes how to modify the "target_suffix" for testing.

jcohenadad commented 2 years ago

Sorry coming back to this again @jcohenadad, I am not sure how I can keep one config file for lesion segmentation. Just to be clear, the files we are talking about are train_randomrater.json, test_randomrater_on_rater1.json, and test_randomrater_on_rater2.json. Only thing that comes to mind is to keep the training config (renaming it accordingly), and add a line or two on README that describes how to modify the "target_suffix" for testing.

Ah, I see, I thought we were talking about other 'two files' (train and test). In any case, if there is only one or two keywords to change, then yes, i would suggest to change it on the syntax rather than on the JSON.

Right now, we are prototyping, so we have more flexibility in terms of version tracking the JSOn files. As long as everything is properly documented for your experiments (so we can come back to them in the future), you can use the syntax.

The JSON will be really useful once the model is put into production (ie: we can use a one liner to reproduce the training)

uzaymacar commented 2 years ago

As shown in the progress report (see the Spinal Cord Segmentation section) for last week models trained with the current config file had a discrepancy between training and test scores. We identified this is most likely due to the usage of a center-crop of (128, 128, 128). This value was indeed not well advised. The latest commit 8d4ca8f modifies the preprocessing QC script s.t. we log sizes of the original images.

When we run this latest QC, we see that all 30 subject images have a size of (240, 256, 176) and we will be updating the center-crop parameter accordingly.

uzaymacar commented 2 years ago

The latest commit cb8a25c:

Note that the center crop size is larger than the size of the original images (240, 256, 176) as reported above. This is because ivadomed requires input shape of each dimension to be a *multiple of length plus 2 padding and a multiple of 16** where the length in our case is 128 per axis as given by the length_3D parameter under Modified3DUNet.

jcohenadad commented 2 years ago

This is because ivadomed requires input shape of each dimension to be a multiple of length plus 2 * padding and a multiple of 16 where the length in our case is 128 per axis as given by the length_3D parameter under Modified3DUNet.

Hum, is that a burden? Should we consider alleviating this limitation?

uzaymacar commented 2 years ago

Hum, is that a burden? Should we consider alleviating this limitation?

I don't think its a problem for this project, but I think at the very least it is an issue-worthy topic! I have opened one in ivadomed: https://github.com/ivadomed/ivadomed/issues/1022.

uzaymacar commented 2 years ago

The latest commit 32a1108:

This config / latest model achieves a test Dice score of 0.9473. More info can be found in this week's progress report. Additionally, the main comment of this PR is updated with steps to use / reproduce the model.

uzaymacar commented 2 years ago

As per the discussion in the meeting, we might want to check whether (256, 256, 256) input shapes have any limitations in terms of RAM etc. in inference.

uzaymacar commented 2 years ago

The latest commit c8359b1 changes slice axis to sagittal as discussed in #26. Note that the center-crop, length, and stride size params stay the same here as it is (256, 256, 256), i.e. same size in all axes. It also reduces the number of epochs from 500 to 200 because we see a faster convergence in spinal cord segmentation compared to lesion segmentation.

Subsequently, the models mentioned here are updated and new results are consistent with the previous results. The latest model achieves a test Dice score of 0.9511.