Ventral rootlets - results of second segmentation model

LouisThomasLapointe commented 7 months ago

1) Dataset

The first training was done on 20 subjects from the spine-generic and OpenNeuro datasets. The ventral rootlets were segmented manually, while the dorsal rootlet segmentations were taken from the D5 dataset described here.

2) Model training

An nnUNet `3d_fullres` model was trained on 20 subjects. To initialize the dataset, the following command was used: `nnUNetv2_plan_and_preprocess -d 104 --verify_dataset_integrity -c 3d_fullres`

To start the training, the following command was used: `CUDA_VISIBLE_DEVICES=0 nnUNetv2_train 104 3d_fullres 0`

To run inference on new images, the following command was used: `nnUNetv2_predict -i nnUNet_raw/Dataset104_M1/imagesTs -o nnUNet_results/Dataset104_M1/labels_results -d 104 -c 3d_fullres -f 0`, where the `Dataset104_M1/imagesTs` folder contains the images on which inference was run.
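The three steps above can be chained from a single script. The following is only a minimal sketch: it assumes nnUNetv2 is installed and that `nnUNet_raw` and `nnUNet_results` are reachable from the working directory, as in the commands above. The helper names `build_commands` and `run_pipeline` and the `dry_run` flag are hypothetical, and only the flags quoted above are used.

```python
import os
import shlex
import subprocess

DATASET_ID = "104"
CONFIG = "3d_fullres"
FOLD = "0"


def build_commands(dataset_id=DATASET_ID, config=CONFIG, fold=FOLD):
    """Return the three nnUNetv2 commands quoted above as argument lists."""
    # Dataset folder name as used in this thread (Dataset104_M1).
    dataset_dir = f"Dataset{dataset_id}_M1"
    return [
        # 1) Plan and preprocess the dataset.
        ["nnUNetv2_plan_and_preprocess", "-d", dataset_id,
         "--verify_dataset_integrity", "-c", config],
        # 2) Train one fold.
        ["nnUNetv2_train", dataset_id, config, fold],
        # 3) Predict on the held-out test images.
        ["nnUNetv2_predict",
         "-i", f"nnUNet_raw/{dataset_dir}/imagesTs",
         "-o", f"nnUNet_results/{dataset_dir}/labels_results",
         "-d", dataset_id, "-c", config, "-f", fold],
    ]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it on GPU 0."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
    for cmd in build_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True, env=env)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to check them before starting a long training run.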

3) Results

Here are the learning curves for the training.

![progress](https://github.com/ivadomed/model-spinal-rootlets/assets/146556291/31e293ab-2ce3-47b2-b744-6efc0e02dc0a)

The next graph shows the performance of the V1 (Dataset101) and V2 (Dataset104) models on the test subjects. The segmentation quality improves with the V2 model (mean Dice of 0.64 versus 0.58 for V1).

The relatively low mean Dice score and large standard deviation arise because both the V1 and V2 models struggle to label the spinal levels correctly on one of the test subjects, which results in a large number of spinal level mislabeling (SLM) errors (see the tables below).

V1 model performance on test subjects

| Image Name | TP | SLM | FP | FN | Dice |
| -- | -- | -- | -- | -- | -- |
| b'sub-007_ses-headNormal_009.nii.gz' | 2398 | 0 | 1454 | 2165 | 0,56993464 |
| b'sub-010_ses-headUp_015.nii.gz' | 3817 | 0 | 535 | 2755 | 0,69882827 |
| b'sub-amu02_215.nii.gz' | 645 | 769 | 813 | 1196 | 0,26669423 |
| b'sub-barcelona01_212.nii.gz' | 1985 | 0 | 209 | 1803 | 0,66365764 |
| b'sub-brnoUhb03_209.nii.gz' | 5390 | 0 | 1381 | 2934 | 0,71414376 |
| Mean | | | | | 0,58265171 |

V2 model performance on test subjects

| Image Name | TP | SLM | FP | FN | Dice |
| -- | -- | -- | -- | -- | -- |
| b'sub-007_ses-headNormal_009.nii.gz' | 2363 | 0 | 1279 | 2200 | 0,57599025 |
| b'sub-010_ses-headUp_015.nii.gz' | 4158 | 0 | 516 | 2414 | 0,73946292 |
| b'sub-amu02_215.nii.gz' | 976 | 310 | 686 | 1324 | 0,42601484 |
| b'sub-barcelona01_212.nii.gz' | 2492 | 0 | 389 | 1296 | 0,74733843 |
| b'sub-brnoUhb03_209.nii.gz' | 5357 | 0 | 1157 | 2967 | 0,72206497 |
| Mean | | | | | 0,64217428 |

For the subject with SLM errors (sub-amu02, the only one with a non-zero SLM count), the spine is more strongly curved at the lower levels, which leads to SLM at those levels, as shown in the image below. The picture on the left is the V1 model and the picture on the right is the V2 model. Green pixels are correctly labeled, yellow pixels are false negatives, red pixels are false positives, and blue pixels are SLM.
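The Dice values in the two tables can be reproduced from the voxel counts. The sketch below assumes that each SLM voxel is counted twice in the denominator, i.e. Dice = 2·TP / (2·TP + FP + FN + 2·SLM). This formula is inferred from the tabulated numbers and is not taken from the evaluation script; the function and variable names are illustrative. The script also reports the mean and the sample standard deviation mentioned above.

```python
from statistics import mean, stdev

# (TP, SLM, FP, FN) per test subject, copied from the tables above.
V1 = {
    "sub-007_ses-headNormal_009": (2398, 0, 1454, 2165),
    "sub-010_ses-headUp_015": (3817, 0, 535, 2755),
    "sub-amu02_215": (645, 769, 813, 1196),
    "sub-barcelona01_212": (1985, 0, 209, 1803),
    "sub-brnoUhb03_209": (5390, 0, 1381, 2934),
}
V2 = {
    "sub-007_ses-headNormal_009": (2363, 0, 1279, 2200),
    "sub-010_ses-headUp_015": (4158, 0, 516, 2414),
    "sub-amu02_215": (976, 310, 686, 1324),
    "sub-barcelona01_212": (2492, 0, 389, 1296),
    "sub-brnoUhb03_209": (5357, 0, 1157, 2967),
}


def dice_with_slm(tp, slm, fp, fn):
    """Dice score where each mislabeled-level (SLM) voxel is penalized twice.

    Assumption: this weighting is inferred from the tables, not from the
    original evaluation code.
    """
    return 2 * tp / (2 * tp + fp + fn + 2 * slm)


def dice_scores(table):
    """Return the per-subject Dice scores for one model."""
    return [dice_with_slm(*counts) for counts in table.values()]


for model_name, table in (("V1", V1), ("V2", V2)):
    scores = dice_scores(table)
    print(f"{model_name}: mean Dice = {mean(scores):.4f}, "
          f"sample std = {stdev(scores):.4f}")
```

Under this assumption, a voxel assigned to the wrong spinal level lowers the score more than a plain false positive or false negative, which would explain why sub-amu02 pulls the mean down so strongly.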

valosekj commented 7 months ago

Nice progress @LouisThomasLapointe!

It's great to see that the model has improved (i.e., Dice is higher) when more subjects are added!

Could you please run `training_scripts/plot_nnunet_training_log.py` to generate a figure showing the validation pseudo Dice for each class (i.e., each rootlet level), to see which levels the model struggles with?

Also, the figure is slightly difficult to follow; could you please generate a GIF that toggles the overlays, for example, as done here?

valosekj commented 6 months ago

Closing -- see summary: https://github.com/ivadomed/model-spinal-rootlets/issues/42
