sct-pipeline / contrast-agnostic-softseg-spinalcord

Contrast-agnostic spinal cord segmentation project with softseg
MIT License

Towards Contrast-agnostic Soft Segmentation of the Spinal Cord


Official repository for the contrast-agnostic spinal cord segmentation project using SoftSeg.

This repo contains all the code for data preprocessing, training and running inference on other datasets. The code is mainly based on Spinal Cord Toolbox and MONAI (PyTorch).

CITATION INFO: If you find this work and/or code useful for your research, please cite our paper:

@article{bedard2023towards,
  title={Towards contrast-agnostic soft segmentation of the spinal cord},
  author={B{\'e}dard, Sandrine and Enamundram, Naga Karthik and Tsagkas, Charidimos and Pravat{\`a}, Emanuele and Granziera, Cristina and Smith, Andrew and Weber II, Kenneth Arnold and Cohen-Adad, Julien},
  journal={arXiv preprint arXiv:2310.15402},
  year={2023},
  url={https://arxiv.org/abs/2310.15402}
}

1. Main Dependencies

2. Dataset

The source data can be found at spine-generic multi-subject.

The preprocessed data are located at duke:projects/ivadomed/contrast-agnostic-seg/data_processed_sg_2023-08-08_NO_CROP/data_processed_clean (internal server).

3. Preprocessing

Main preprocessing steps include:

For T1w and T2w:

For T2star:

For DWI:

Next steps are to generate the contrast-agnostic soft segmentation:

The output of this script is a new derivatives/labels_softseg/ folder that contains the soft labels used in this contrast-agnostic segmentation project. All registrations were manually QC-ed (see Quality Control), and the problematic registrations were listed in exclude.yml. The processing was then run again to generate the soft segmentations.

Specify the path of the preprocessed dataset with the flag -path-data.
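As a rough illustration of what a soft label is, the sketch below averages binary masks from several contrasts voxel by voxel. It assumes that the masks have already been registered to a common space and that the soft label is their plain mean; both points are assumptions made for this example, and neither the function name nor the toy data come from process_data.sh.

```python
def average_masks(masks):
    """Voxel-wise mean of equally sized binary masks (flattened to 1-D lists).

    Hypothetical helper for illustration; not part of this repository.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    n_voxels = len(masks[0])
    if any(len(m) != n_voxels for m in masks):
        raise ValueError("all masks must have the same number of voxels")
    # zip(*masks) groups the values of one voxel across all contrasts
    return [sum(voxel) / len(masks) for voxel in zip(*masks)]


# Toy 1-D "masks" for three contrasts, assumed to be in the same space
t1w = [0, 1, 1, 1, 0]
t2w = [0, 0, 1, 1, 0]
dwi = [0, 0, 1, 1, 1]

soft = average_masks([t1w, t2w, dwi])
print(soft)
```

Voxels where all contrasts agree get a value of 0 or 1, while voxels at the cord boundary, where contrasts disagree, get an intermediate value.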

3.1. Launch Preprocessing

This section assumes that SCT is installed. The installation instructions can be found here.

cd processing_spine_generic
sct_run_batch -jobs -1 -path-data <PATH_DATA> -path-output <PATH-OUTPUT> -script process_data.sh -script-args exclude.yml

or use a config file:

Example config_process_data.json: 
{
  "path_data"   : "~/data_nvme_sebeda/datasets/data-multi-subject/",
  "path_output" : "~/data_nvme_sebeda/data_processed_sg_2023-08-04_NO_CROP",
  "script"      : "process_data.sh",
  "jobs"        : 50,
  "exclude_list": ["sub-brnoUhb02", "sub-brnoUhb03", "sub-brnoUhb07", "sub-brnoUhb08", "sub-brnoUhb08", "sub-brnoUhb08", "sub-ucdavis01", "sub-ucdavis02", "sub-ucdavis03", "sub-ucdavis04", "sub-ucdavis05", "sub-ucdavis06", "sub-ucdavis07", "sub-beijingVerio01", "sub-beijingVerio02", "sub-beijingVerio03", "sub-beijingVerio04", "sub-beijingGE01", "sub-beijingGE02", "sub-beijingGE03", "sub-beijingGE04", "sub-ubc01", "sub-oxfordOhba02"]
}

sct_run_batch -config config_process_data.json
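Because the exclude list is long and maintained by hand, it can be convenient to generate the config file from a script. The sketch below is one possible way to do that with the Python standard library. The helper name build_batch_config is hypothetical, the subject list is shortened for the example, and only the keys shown in the config above are used. It also drops repeated subject IDs while keeping their order.

```python
import json


def build_batch_config(path_data, path_output, script, jobs, exclude_list):
    """Return a dict with the same keys as the example config above.

    Hypothetical helper; not part of this repository or of SCT.
    """
    # dict.fromkeys keeps the first occurrence of each subject, in order
    unique_subjects = list(dict.fromkeys(exclude_list))
    return {
        "path_data": path_data,
        "path_output": path_output,
        "script": script,
        "jobs": jobs,
        "exclude_list": unique_subjects,
    }


config = build_batch_config(
    path_data="~/data_nvme_sebeda/datasets/data-multi-subject/",
    path_output="~/data_nvme_sebeda/data_processed_sg_2023-08-04_NO_CROP",
    script="process_data.sh",
    jobs=50,
    exclude_list=["sub-brnoUhb08", "sub-brnoUhb08", "sub-ucdavis01"],
)

# Save this text as config_process_data.json before calling sct_run_batch
config_text = json.dumps(config, indent=2)
print(config_text)
```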

A data_processed_clean folder is created, which contains the cropped data and derivatives. Only the images that have both a manual segmentation and a soft segmentation are transferred to it.

3.2. Quality control

After running the analysis, check your Quality Control (QC) report by opening the file /qc/index.html. Use the "search" feature of the QC report to quickly jump to segmentations or labeling issues.

1. Segmentations

If segmentation issues are noticed while checking the quality report, correct them manually using the procedure below:

  1. Correct the segmentations manually using FSLeyes or ITK-SNAP.
  2. Upload the manual segmentations (_seg-manual.nii.gz) with their JSON sidecars to the derivatives.
  3. Re-run the analysis (see Section 3.1, Launch Preprocessing).
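A JSON sidecar for a manually corrected segmentation can be written with a few lines of Python. The sketch below is only an illustration: the "Author" and "Date" fields are an assumed convention that this README does not spell out, and the helper names and the example filename are hypothetical.

```python
import json
from datetime import datetime


def make_sidecar(author, when=None):
    """Return sidecar metadata for a manually corrected segmentation.

    The "Author" and "Date" fields are an assumed convention.
    """
    when = when or datetime.now()
    return {"Author": author, "Date": when.strftime("%Y-%m-%d %H:%M:%S")}


def sidecar_name(seg_filename):
    """Map <name>_seg-manual.nii.gz to <name>_seg-manual.json."""
    suffix = ".nii.gz"
    if not seg_filename.endswith(suffix):
        raise ValueError(f"expected a {suffix} file, got {seg_filename!r}")
    return seg_filename[: -len(suffix)] + ".json"


# Hypothetical author and filename, for illustration only
sidecar = make_sidecar("Jane Doe", datetime(2023, 8, 8, 12, 0, 0))
print(sidecar_name("sub-example01_T2w_seg-manual.nii.gz"))
print(json.dumps(sidecar, indent=4))
```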

2. Registrations

3. Soft segmentations

4. Training

4.1. Setting up the environment

The following commands show how to set up the environment. Note that the documentation assumes that the user has conda installed on their system. Instructions on installing conda can be found here.

  1. Create a conda environment with the following command:

    conda create -n venv_monai python=3.9
  2. Activate the environment with the following command:

    conda activate venv_monai
  3. Install the required packages with the following command:

    pip install -r monai/requirements.txt

4.2. Datalist creation

The training script expects a datalist file in the Medical Segmentation Decathlon (MSD) format containing image-label pairs. The datalist can be created by running the create_msd_data.py script. For example, to create the datalist for the soft_all model:

python monai/create_msd_data.py -pd ~/duke/projects/ivadomed/contrast-agnostic-seg/data_processed_sg_2023-08-08_NO_CROP/data_processed_clean -po ~/datasets/contrast-agnostic/ --contrast all --label-type soft --seed 42

The dataset split containing the training, validation, and test subjects can be found in the monai/data_split_all_soft_seed15.yaml file.

Note: The output of the above command is just a .json file pointing to the image-label pairs in the original BIDS dataset. It does not copy the existing data to the output folder.
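To make the idea of a path-only datalist concrete, here is a minimal sketch of how such a file could be built. It does not reproduce the logic of create_msd_data.py: the function name, the JSON keys (train, validation, test, image, label), the split fractions and the file paths are all assumptions chosen for illustration. The point is that the file stores paths only, and that a fixed seed makes the subject split reproducible.

```python
import json
import random


def make_datalist(pairs_by_subject, seed, val_frac=0.2, test_frac=0.2):
    """Split subjects (not images) into train/validation/test and list path pairs.

    Hypothetical sketch; keys and fractions are illustrative assumptions.
    """
    subjects = sorted(pairs_by_subject)
    random.Random(seed).shuffle(subjects)  # seeded, so the split is reproducible

    n_test = int(len(subjects) * test_frac)
    n_val = int(len(subjects) * val_frac)
    split = {
        "test": subjects[:n_test],
        "validation": subjects[n_test:n_test + n_val],
        "train": subjects[n_test + n_val:],
    }
    # Only paths are stored; no image data is copied
    return {
        name: [
            {"image": img, "label": lbl}
            for sub in subs
            for img, lbl in pairs_by_subject[sub]
        ]
        for name, subs in split.items()
    }


# Made-up subject IDs and paths in a BIDS-like layout
pairs = {
    f"sub-{i:02d}": [
        (
            f"sub-{i:02d}/anat/sub-{i:02d}_T2w.nii.gz",
            f"derivatives/labels_softseg/sub-{i:02d}/anat/sub-{i:02d}_T2w_softseg.nii.gz",
        )
    ]
    for i in range(1, 11)
}
datalist = make_datalist(pairs, seed=42)
print(json.dumps(datalist, indent=2)[:300])
```

Splitting at the subject level, rather than at the image level, keeps all contrasts of one subject in the same partition.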

4.3. Training

The training uses MONAI functions and is written in PyTorch Lightning. Example training command to run the soft_all model:

python monai/main.py -m nnunet -crop 64x192x320 --contrast all --label-type soft -initf 32 -me 200 -bs 2 -opt adam -lr 1e-3 -cve 5 -pat 20 -epb -stp --enable_DS

Example training command to run the soft_per_contrast model on the dwi contrast:

python monai/main.py -m nnunet -crop 64x192x320 --contrast dwi --label-type soft -initf 32 -me 3 -bs 2 -opt adam -lr 1e-3 -cve 5 -pat 20 -epb -stp --enable_DS

These commands assume that the datalist created in Section 4.2 lies in the same folder as monai/main.py. Run python monai/main.py -h to see all the available arguments and their descriptions.

Note: WandB is used for experiment tracking and is implemented via Lightning's WandbLogger. Make sure that the project and entity are changed to the appropriate values.

4.4. Running inference

Inference can be run on single images using the monai/run_inference_single_image.py script. Run python monai/run_inference_single_image.py -h for usage instructions. Both CPU- and GPU-based inference are supported.

4.5. Compute ANIMA metrics

To compute the ANIMA metrics shown in the paper, the scripts compute_anima_metrics_*.py are used. For generating the metrics on the spine-generic dataset and also for deepseg and propseg methods, use the following command:

python anima_metrics/compute_anima_metrics_spine_generic.py --pred-folder <PATH_PREDS> --method <monai/deepseg2d/deepseg3d/propseg> -dname spine-generic

For reproducing the results on the other datasets, use the following command:

python anima_metrics/compute_anima_metrics_unseen_datasets.py --pred-folder <PATH_PREDS> -dname <sci-t2w/ms-mp2rage/radiculopathy-epi>

Note: The --pred-folder argument expects the path to the folder containing the prediction and GT segmentation masks.
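Overlap metrics compare a prediction mask with its GT mask voxel by voxel. The sketch below is not ANIMA's implementation and makes no claim about the exact set of metrics reported in the paper; it only illustrates two common segmentation metrics, the Dice coefficient and the relative volume error, on toy binary masks. The function names and the sign convention of the volume error are assumptions.

```python
def dice(pred, gt):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(gt):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    # Two empty masks agree perfectly by convention
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_error(pred, gt):
    """(predicted volume - GT volume) / GT volume; assumed sign convention."""
    gt_volume = sum(gt)
    if gt_volume == 0:
        raise ValueError("GT mask is empty; relative error is undefined")
    return (sum(pred) - gt_volume) / gt_volume


# Toy masks: same size, shifted by one voxel
pred = [0, 1, 1, 1, 0, 0]
gt = [0, 0, 1, 1, 1, 0]

print(f"Dice: {dice(pred, gt):.3f}")
print(f"RVE:  {relative_volume_error(pred, gt):.3f}")
```

The example shows why several metrics are reported together: the two masks have identical volumes (zero volume error) but overlap only partially.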

5. Computing morphometric measures (CSA)

To compute the CSA at the C2-C3 vertebral levels on the prediction masks and get the QC report of the predictions, the scripts compute_csa_qc_<nnunet/monai>.sh are used. The input is the folder data_processed_clean (the result of preprocessing), and the path to the prediction masks is passed as an extra script argument via -script-args.

For every trained model, you can run:

sct_run_batch -jobs -1 -path-data /data_processed_clean/ -path-output <PATH_OUTPUT> -script compute_csa_qc_<nnunet/monai>.sh -script-args <PATH_PRED_MASKS>

The CSA results will be under <PATH_OUTPUT>/results and the QC report under <PATH_OUTPUT>/qc.

5.1. Using contrast-agnostic model (best)

Here is an example of how to compute CSA and QC with the contrast-agnostic model:

sct_run_batch -jobs -1 -path-data ~/duke/projects/ivadomed/contrast-agnostic-seg/data_processed_sg_2023-03-10_NO_CROP/data_processed_clean -path-output ~/results -script compute_csa_qc_monai.sh -script-args ~/duke/projects/ivadomed/contrast-agnostic-seg/models/monai/spine-generic-results

5.2. Using nnUNet model

Note: For nnUNet, change the variable prefix in the script compute_csa_qc_nnunet.sh according to the prefix in the prediction name. Here is an example of how to compute CSA and QC with nnUNet models:

sct_run_batch -jobs -1 -path-data ~/duke/projects/ivadomed/contrast-agnostic-seg/data_processed_sg_2023-03-10_NO_CROP/data_processed_clean -path-output ~/results -script compute_csa_qc_nnunet.sh -script-args ~/duke/projects/ivadomed/contrast-agnostic-seg/models/nnunet/spine-generic-results/test_predictions_2023-08-24

6. Analyse CSA and QC reports

To generate violin plots and analyse the results, put all CSA result files in the same folder (here csa_ivadomed_vs_nnunet_vs_monai) and run:

python analyse_csa_all_models.py -i-folder ~/duke/projects/ivadomed/contrast-agnostic-seg/csa_measures_pred/csa_ivadomed_vs_nnunet_vs_monai/ \
                                 -include csa_monai_nnunet_2023-09-18 csa_monai_nnunet_per_contrast csa_gt_2023-08-08 csa_gt_hard_2023-08-08 \
                                          csa_nnunet_2023-08-24 csa_other_methods_2023-09-21-all csa_monai_nnunet_2023-09-18_hard csa_monai_nnunet_diceL

The plots will be saved to the parent directory with the name charts_<datetime.now()>.

7. Get QC reports for other datasets

The QC reports from three other datasets sci-t2w, ms-mp2rage, and radiculopathy-epi are shown in the paper. The scripts for reproducing the results are in the qc_other_datasets folder.

General command to run QC on prediction masks from other datasets:

sct_run_batch -path-data <PATH_DATA> -path-output <PATH-OUT> -script-args <PATH_PRED_MASK> -jobs 20 -script run_qc_prediction_<dataset>.sh

7.1. Running QC on predictions from SCI-T2w dataset

Using the contrast-agnostic model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/sci-colorado/ \
                       -path-output ~/path-to-output/qc_contrast-agnostic_sci-colorado \
                       -script run_qc_prediction_sci_colorado.sh \
                       -script-args ~/duke/projects/ivadomed/contrast-agnostic-seg/models/monai/sci-colorado-results/test_preds_colorado_soft_all

Using the nnUNet model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/sci-colorado/ \
                       -path-output ~/path-to-output/qc_nnunet_sci-colorado \
                       -script run_qc_prediction_sci_colorado.sh \
                       -script-args "/home/GRAMES.POLYMTL.CA/u114716/duke/projects/ivadomed/contrast-agnostic-seg/models/nnunet/sci-colorado-results/test_predictions nnUNet"

7.2. Running QC on predictions from MS-MP2RAGE dataset

Using the contrast-agnostic model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/basel-mp2rage/ \
                       -path-output ~/path-to-output/qc_contrast-agnostic_basel-mp2rage \
                       -script run_qc_prediction_basel_mp2rage.sh \
                       -script-args ~/duke/projects/ivadomed/contrast-agnostic-seg/models/monai/basel-mp2rage-rpi-results/test_preds_mp2rage_soft_all

Using the nnUNet model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/basel-mp2rage/ \
                       -path-output ~/path-to-output/qc_nnunet_basel-mp2rage \
                       -script run_qc_prediction_basel_mp2rage.sh \
                       -script-args "/home/GRAMES.POLYMTL.CA/u114716/duke/projects/ivadomed/contrast-agnostic-seg/models/nnunet/basel-mp2rage-rpi-results/test_predictions nnUNet"

7.3. Running QC on predictions from Radiculopathy-EPI dataset

Using the contrast-agnostic model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/epi-stanford/ \
                       -path-output ~/path-to-output/qc_contrast-agnostic_epi-stanford \
                       -script run_qc_prediction_epi_stanford.sh \
                       -script-args ~/duke/projects/ivadomed/contrast-agnostic-seg/models/monai/epi-stanford-results/test_preds_soft_all

Using the nnUNet model:

sct_run_batch -jobs 32 -path-data ~/path-to-dataset/epi-stanford/ \
                       -path-output ~/path-to-output/qc_nnunet_radiculopathy-epi \
                       -script run_qc_prediction_epi_stanford.sh \
                       -script-args "/home/GRAMES.POLYMTL.CA/u114716/duke/projects/ivadomed/contrast-agnostic-seg/models/nnunet/epi-stanford-results/test_preds_stanford_rest_weber nnUNet"

8. Active learning procedure (TODO)

Note: This section is still a work in progress.

To extend the training set to other contrasts and to pathologies, we applied the segmentation model to other datasets, manually corrected the segmentations and added them to the training set.

Here is the detailed procedure:

  1. Run inference on other datasets for the selected models and generate the QC report from prediction masks.
  2. Select ~20 interesting images per dataset (using the QC report).
  3. Correct the inference on the selected subjects if needed (you can use the manual-correction script).
  4. Add the inferred segmentations to the derivatives/labels_contrast_agnostic folder of each dataset.
  5. Add inferred segmentations to the training set (keep the same testing spine generic subjects) & retrain a model.
  6. Compute CSA on the spine-generic testing set and compare its standard deviation (STD) with that of the previous model.
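The comparison in the last step can be sketched as follows. This is a minimal example under stated assumptions: the CSA values are made up, the (subject, contrast, CSA) record layout and the function name are hypothetical, and variability is summarised as the per-subject sample STD across contrasts, averaged over subjects.

```python
from collections import defaultdict
from statistics import mean, stdev


def csa_std_across_contrasts(rows):
    """Return {subject: sample STD of CSA across that subject's contrasts}."""
    by_subject = defaultdict(list)
    for subject, _contrast, csa in rows:
        by_subject[subject].append(csa)
    # stdev needs at least two values, so skip subjects with a single contrast
    return {s: stdev(v) for s, v in by_subject.items() if len(v) >= 2}


# Made-up CSA values in mm^2, for illustration only
before = [("sub-01", "T1w", 70.0), ("sub-01", "T2w", 74.0), ("sub-01", "dwi", 78.0)]
after = [("sub-01", "T1w", 73.0), ("sub-01", "T2w", 74.0), ("sub-01", "dwi", 75.0)]

std_before = mean(csa_std_across_contrasts(before).values())
std_after = mean(csa_std_across_contrasts(after).values())
print(f"mean STD before: {std_before:.2f} mm^2, after: {std_after:.2f} mm^2")
# prints "mean STD before: 4.00 mm^2, after: 1.00 mm^2"
```

A lower STD after retraining would indicate that the CSA estimates vary less from one contrast to another.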