sct-pipeline / contrast-agnostic-softseg-spinalcord

Contrast-agnostic spinal cord segmentation project with softseg

One outlier subject in the softseg CSA of T1w contrast #66

Closed naga-karthik closed 1 year ago

naga-karthik commented 1 year ago

I plotted the CSA of the softseg GTs from the latest preprocessed data found under ~/duke/projects/ivadomed/contrast-agnostic-seg/data_processed_sg_2023-08-08_NO_CROP/results. The result is shown below. Note that there is one outlier subject that stretches the violin plot.

[Figure: per-contrast GT CSA]

I looked at the .csv file csa_soft_GT_T1w for this contrast, and the subject turns out to be sub-oxfordFmrib04_T1w. Below you can see what the softseg looks like. It is indeed incomplete, which results in the outlier in the violin plot.

[Figure: outlier subject in FSLeyes]

The most surprising part is that the model learned this outlier and then predicted something similar in terms of CSA. Below you can find the CSA of the model prediction.

[Figure: model prediction outlier]

I think this subject needs to be fixed. I showed this to @valosekj and we concluded that we should run the preprocessing again and do a thorough QC of the softsegs so that we eliminate any biases a model could learn. Tagging @sandrinebedard.

sandrinebedard commented 1 year ago

From my investigations, this is because of the bad T1w registration to T2w.

anim

We did not notice this since the old disc labels were also not properly warped to the T1w space and thus did not include this half slice in the CSA computation... The QC is over 1602 images to check, sorry if I missed that one :(

Next steps:

naga-karthik commented 1 year ago

Got it, thanks for the quick response! So, does this mean that we only remove the sub-oxfordFmrib04 and the preprocessed data is good? OR, do we have to run the preprocessing again? (in any case, I would have to re-train the model without this subject in the dataset)

> The QC is over 1602 images to check, sorry if I missed that one :(

omg! you don't have to do it alone! don't hesitate to either tag me and/or Jan to split the task of QC checking! :)

sandrinebedard commented 1 year ago

We don't need to re-run preprocessing; we can just delete this subject so you can retrain quickly! And I will go through the QC again next week (we can discuss it at Tuesday's meeting too) to make sure this was the only one, sounds good?

naga-karthik commented 1 year ago

Yes, sounds good! I'll re-train over the weekend.

Another way to do a robust QC would be to also sort the Mean(Area) column in ascending order in the soft GT .csv files in the results folder of the preprocessed dataset and look for any subjects with abnormally high or low CSA.

Jan and I quickly checked it and only sub-oxfordFmrib04 stood out, with ~33 mm^2; the others were okay. It would be better if you could take a look too!
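
For reference, here is a minimal sketch of the sorting check described above, not the script we actually used. It assumes each soft GT .csv has one row per subject, a `Filename` column (an assumption) and a `Mean(Area)` column (matched case-insensitively). The function name `find_csa_outliers`, the median-absolute-deviation rule, and the threshold `k=5` are all illustrative choices, and the demo values are synthetic (only the 33.0 echoes the ~33 mm^2 mentioned above).

```python
import csv
import io
import statistics


def find_csa_outliers(csv_file, area_col="Mean(Area)", name_col="Filename", k=5.0):
    """Sort rows by CSA and flag values far from the median.

    Hypothetical helper: the column names and the k * MAD rule are
    assumptions, not part of the project's pipeline.
    """
    reader = csv.DictReader(csv_file)
    # Match column names case-insensitively, since the header casing may differ.
    lookup = {name.strip().lower(): name for name in reader.fieldnames}
    area_key = lookup[area_col.lower()]
    name_key = lookup[name_col.lower()]

    rows = []
    for row in reader:
        try:
            rows.append((row[name_key], float(row[area_key])))
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric CSA

    # Ascending sort, so abnormally low CSA values come first.
    rows.sort(key=lambda item: item[1])

    areas = [area for _, area in rows]
    median = statistics.median(areas)
    # Median absolute deviation: a spread estimate that the outlier
    # itself barely influences.
    mad = statistics.median(abs(area - median) for area in areas)
    outliers = [(name, area) for name, area in rows if abs(area - median) > k * mad]
    return rows, outliers


# Synthetic demo data (made-up subject names and values).
DEMO_CSV = """Filename,Mean(Area)
sub-demo01_T1w,70.1
sub-demo02_T1w,72.4
sub-demo03_T1w,68.9
sub-demo04_T1w,33.0
sub-demo05_T1w,71.2
sub-demo06_T1w,69.5
"""

rows, outliers = find_csa_outliers(io.StringIO(DEMO_CSV))
for name, area in outliers:
    print(f"{name}: {area:.1f} mm^2")
```

With a real file, you would pass an open file handle instead of the `io.StringIO` demo. The threshold `k` probably needs tuning per contrast, so treat the flagged list as candidates for visual QC rather than an automatic exclusion.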

sandrinebedard commented 1 year ago

great thanks! I'll take a look too!

naga-karthik commented 1 year ago

@sandrinebedard any updates on this? If you confirm that the QC for the latest version of the dataset is good then we can close this issue!

sandrinebedard commented 1 year ago

Now the subject sub-oxfordFmrib04 is in the exclude list: https://github.com/sct-pipeline/contrast-agnostic-softseg-spinalcord/commit/bec11205428e0ab7c82f1715a6b7fbdcd9067eca