sct-pipeline / fmri-segmentation

Repository for the project on automatic spinal cord segmentation based on fMRI EPI data
MIT License

Potential issues with the ground truth for Zurich Lumbar data #13

Closed rohanbanerjee closed 5 months ago

rohanbanerjee commented 1 year ago

Hello, I am opening this issue so that we can discuss the ground truths we have for the Zurich Lumbar data. This is a resting-state dataset with 13 subjects.

Explanation

I have trained a 2D U-Net with nnUNet on all the manually labeled datasets. When I ran inference, I observed that most of the images with a low Dice score are from the Zurich Lumbar dataset. I looked into these images and found that some of the segmentation masks might not be optimal, in my opinion, and I would love to discuss this. For example,

(Yellow = ground truth, Red = prediction mask)

Subject - Zurich Lumbar C01 Dice: 0.5604

Slice 14

Screen Shot 2023-06-20 at 3 44 04 PM Screen Shot 2023-06-20 at 3 44 45 PM Screen Shot 2023-06-20 at 3 46 23 PM

Slice 8

Screen Shot 2023-06-20 at 3 51 53 PM Screen Shot 2023-06-20 at 3 53 11 PM Screen Shot 2023-06-20 at 3 54 05 PM
(The model did not predict a segmentation mask for this slice)

Subject - Zurich Lumbar C12 Dice: 0.7843

Slice 5

Screen Shot 2023-06-20 at 4 03 25 PM Screen Shot 2023-06-20 at 4 04 19 PM Screen Shot 2023-06-20 at 4 04 46 PM

Thanks!
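For reference, the per-slice comparison described above can be sketched as follows. This is a minimal illustration, not the evaluation code used for the reported scores: the function name `dice_per_slice`, the choice of the last axis as the slice axis, and the toy arrays are all assumptions. It also shows why a slice with a ground truth mask but no prediction (as for slice 8) contributes a Dice of 0.

```python
import numpy as np

def dice_per_slice(gt, pred, axis=2):
    """Return one Dice score per slice along `axis`.

    Slices where both masks are empty get NaN, since Dice is undefined there.
    """
    gt = np.moveaxis(np.asarray(gt, dtype=bool), axis, 0)
    pred = np.moveaxis(np.asarray(pred, dtype=bool), axis, 0)
    scores = np.full(gt.shape[0], np.nan)
    for z in range(gt.shape[0]):
        denom = gt[z].sum() + pred[z].sum()
        if denom > 0:
            scores[z] = 2.0 * np.logical_and(gt[z], pred[z]).sum() / denom
    return scores

# Synthetic 4x4x3 volume (toy data, not the Zurich Lumbar images).
gt = np.zeros((4, 4, 3), dtype=bool)
pred = np.zeros((4, 4, 3), dtype=bool)
gt[1:3, 1:3, 0] = True    # slice 0: ground truth present
pred[1:3, 1:3, 0] = True  # slice 0: identical prediction
gt[1:3, 1:3, 1] = True    # slice 1: ground truth present, no prediction
# slice 2: both masks empty

scores = dice_per_slice(gt, pred)
print(scores)  # slice 0 -> 1.0, slice 1 -> 0.0, slice 2 -> nan
```

Note that the indices here are 0-based, as in FSLeyes, so slice numbers should be compared with that convention in mind.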

MerveKaptan commented 1 year ago

Hi @rohanbanerjee,

I wanted to check these one by one in FSLeyes. However, I just looked at the first one and I am a bit confused: am I looking at the incorrect dataset, or are the subject numbers and labels mixed up?

image

This is subject 13, slice 14 for me (in FSLeyes, so counting from 0). I also looked at slice 13 in FSLeyes, but this still does not seem to be identical to the slice that you added. Please let me know if I am looking at the incorrect dataset/subject/slice, etc.!

Thank you!!

jcohenadad commented 1 year ago

Assuming the data shown in https://github.com/sct-pipeline/fmri-segmentation/issues/13#issue-1766081250 are correct (referring to Merve’s answer above), I agree the ground truth segmentations are questionable.

rohanbanerjee commented 1 year ago

Hello @MerveKaptan, sorry for the confusion: the first subject is Zurich Lumbar C01, not C13. I have edited the main comment above accordingly.

MerveKaptan commented 1 year ago

Dear all,

Thank you so much for the clarification, @rohanbanerjee! I notice that the CSF/cord contrast is very low. Possibly this is because these are lumbar data, right @jcohenadad?

For subject C01, slice 14, I actually do not see a problem with the mask; I assume I would draw it quite similarly. What do you think is problematic?

image

For slice 8, I think I would draw the mask a bit wider; please see below (blue: original; red: voxels I added to the original).

image

I would personally draw the mask a bit bigger for C12, slice 5, as well.

Would you agree, or how would you alter the mask? It seems as if this is almost a matter of personal taste :) How can we assess this?
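One possible way to put a number on this, sketched below under stated assumptions, is to compute the Dice overlap between two raters' masks for the same slice and use it as a reference point when reading the model's Dice against the ground truth. The masks here are synthetic and the helper name `dice` is illustrative; nothing below comes from the actual Zurich Lumbar annotations.

```python
import numpy as np

def dice(a, b):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom

# Synthetic single slice: the "original" mask covers 4 voxels.
original = np.zeros((5, 5), dtype=bool)
original[1:3, 1:3] = True

# A second rater draws the same mask one column wider (6 voxels).
wider = original.copy()
wider[1:3, 3] = True

inter_rater = dice(original, wider)
print(inter_rater)  # 2 * 4 / (4 + 6) = 0.8
```

In this toy case, adding only two voxels to a four-voxel mask already lowers the Dice to 0.8, which suggests that for a small structure such as the cord in low-resolution EPI, modest differences in drawing style can account for a sizeable part of a low Dice score.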

kennethaweberii commented 1 year ago

Thanks for raising this issue @rohanbanerjee.

In general, we will need to find a consistent way to approach QC for the manual segmentations. I would tend to accept the ground truth as is rather than modify it. Overall, I do not feel the ground truth for C01 and C12 is too far off; it is within reason.

Due to the cauda equina, the spinal cord (and surrounding neural tissue) at the lumbosacral segments can look quite different from the cervical and thoracic segments. The boundaries between the cord and peripheral nerves are often hard to discern given the lower spatial resolution and lower contrast/signal in the functional images. Also, I expect the model will not perform as well on the lumbosacral cord given that we have fewer lumbar spinal cord fMRI datasets (unless we account for this imbalance in the training).
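As an illustration of what accounting for the imbalance could look like, here is a minimal sketch of inverse-frequency sampling weights, so that each spinal region is drawn equally often in expectation during training. The region labels and counts are hypothetical, the function name `balanced_sampling_weights` is made up for this example, and this is not a description of what nnUNet does by default.

```python
from collections import Counter

import numpy as np

def balanced_sampling_weights(regions):
    """Return per-sample probabilities inversely proportional to region frequency."""
    counts = Counter(regions)
    weights = np.array([1.0 / counts[r] for r in regions])
    return weights / weights.sum()

# Hypothetical training set: one region label per image (not the real counts).
regions = np.array(["cervical"] * 6 + ["thoracic"] * 3 + ["lumbar"] * 1)
weights = balanced_sampling_weights(regions)

# Total sampling probability per region: equal for every region.
per_region = {r: weights[regions == r].sum() for r in np.unique(regions)}
print(per_region)

# Draw a batch of sample indices using these probabilities.
rng = np.random.default_rng(0)
batch = rng.choice(len(regions), size=8, p=weights)
```

With these weights the single lumbar image is sampled as often as all six cervical images combined, which rebalances the regions but also means the lumbar image is seen repeatedly, so it does not replace collecting more lumbar data.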

At this time, I think we can table this issue until we have a model trained on the full, more diverse dataset.

Below is an example from the PAM50 template where the peripheral nerves cause a pretty significant change in the appearance of the spinal cord and surrounding neural tissues:

image

image

jcohenadad commented 1 year ago

My suggestion is to revisit the GT for this dataset and not include it in the model for now.

rohanbanerjee commented 5 months ago

We have opted for an active learning strategy (#29) and will be using the Zurich Lumbar data in it too. Closing this issue for now.