neuropoly / data-management

Repo that deals with datalad aspects for internal use

Canproco SC segmentations were saved with a wrong header #305

Closed NathanMolinier closed 5 months ago

NathanMolinier commented 6 months ago

Description

I tried to work with the T2w spinal cord (SC) segmentations stored in the canproco dataset and noticed that the images from Calgary were saved with a wrong header, leading to a wrong display and other problems.

[Image: drawing]

In the RPI system:

Header           Image       SC segmentation
X (pixel)        56          56
Y (pixel)        512         320
Z (pixel)        512         320
px (mm/pixel)    0.800011    0.800011
py (mm/pixel)    0.5         0.5
pz (mm/pixel)    0.5         0.5
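
One quick way to see why these numbers are inconsistent is to compare the physical field of view (number of voxels times voxel size) of the image and of the segmentation. The sketch below uses the values from the table; the helper names are illustrative, and the 0.8 mm isotropic spacing is an assumption taken from the dataset README, not read from a file.

```python
from math import isclose


def field_of_view(dim, pixdim):
    """Physical extent (mm) along each axis: number of voxels times voxel size."""
    return tuple(n * p for n, p in zip(dim, pixdim))


def same_field_of_view(fov_a, fov_b, rel_tol=0.01):
    """True when two extents agree within rel_tol on every axis."""
    return all(isclose(a, b, rel_tol=rel_tol) for a, b in zip(fov_a, fov_b))


# Values from the table (RPI orientation).
image_fov = field_of_view((56, 512, 512), (0.800011, 0.5, 0.5))
seg_fov_as_stored = field_of_view((56, 320, 320), (0.800011, 0.5, 0.5))
# Assumption: 0.8 mm isotropic spacing, as described in the dataset README.
seg_fov_expected = field_of_view((56, 320, 320), (0.8, 0.8, 0.8))

print(same_field_of_view(image_fov, seg_fov_as_stored))  # False
print(same_field_of_view(image_fov, seg_fov_expected))   # True
```

With the stored header, the segmentation covers only 160 mm along Y and Z instead of the 256 mm covered by the image, which is consistent with the wrong display.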

The real resolution for the SC segmentation should be:

Solution

To fix this I wrote a script to overwrite the header. Also, I would like to save the SC segmentations in the same space as the one used for images to simplify future use.

Related issues

Note provided in the README

Note: T2w sagittal images were preprocessed before the spinal cord was segmented ('_T2w_seg-manual.nii.gz') and disc labels were identified ('_T2w_labels-manual.nii.gz'). Namely, reorientation to RPI and resampling to 0.8mm isotropic voxel were performed. This is why the original T2w images have different dimensions than segmentations and labels. Preprocessing steps: https://github.com/ivadomed/canproco/blob/8e1b2c35f96eeeb3838b512dd93eba25e5a5e97a/scripts-t2w_csa/sct-preprocess_data.sh#L162-L170 For context, see https://github.com/neuropoly/data-management/issues/197

I tried the preprocessing steps, but they did not work.

valosekj commented 6 months ago

Good catch @NathanMolinier! Thanks for reporting this! I wonder how that could happen!

Can you please list the problematic subjects? I looked at the first one (sub-cal056) and the dimensions are fine:

# Raw T2w image: 0.8, 0.5, 0.5
canproco$ sct_image -i sub-cal056/ses-M0/anat/sub-cal056_ses-M0_T2w.nii.gz -header | grep pixdim
pixdim      [1.0, 0.8, 0.5, 0.5, 2.51667, 0.0, 0.0, 0.0]

# SC seg obtained from resampled and reoriented image: 0.800011, 0.8, 0.8
canproco$ sct_image -i derivatives/labels/sub-cal056/ses-M0/anat/sub-cal056_ses-M0_T2w_seg-manual.nii.gz -header | grep pixdim
pixdim      [-1.0, 0.800011, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0]

NathanMolinier commented 6 months ago

I double checked the subject sub-cal056 with a fresh git-annex download and these are the images that I get.

[Image: drawing]

So it looks like you are using a modified (or older) version of the canproco dataset, because the problem I showed is the same for every subject from Calgary (sub-calXXX).
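
To list affected subjects without relying on a particular local setup, the voxel sizes can also be read straight from the file header. The following is a minimal sketch, assuming NIfTI-1 files (348-byte header, `dim` stored as 8 int16 at byte offset 40, `pixdim` as 8 float32 at byte offset 76). The function names are hypothetical, and the expected 0.8 mm isotropic spacing is taken from the README note.

```python
import gzip
import struct
from pathlib import Path


def read_dim_and_pixdim(path):
    """Return (dim, pixdim) for the three spatial axes of a NIfTI-1 file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError(f"{path} is too short to hold a NIfTI-1 header")
    # sizeof_hdr must equal 348; use it to detect the byte order.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError(f"{path} does not look like a NIfTI-1 file")
    dim = struct.unpack(endian + "8h", header[40:56])
    pixdim = struct.unpack(endian + "8f", header[76:108])
    # dim[0] is the number of dimensions and pixdim[0] is qfac; skip both.
    return dim[1:4], pixdim[1:4]


def has_expected_spacing(path, expected=(0.8, 0.8, 0.8), tol=1e-3):
    """True when the stored voxel sizes match the expected spacing (mm)."""
    _, pixdim = read_dim_and_pixdim(path)
    return all(abs(p - e) <= tol for p, e in zip(pixdim, expected))
```

Calling `has_expected_spacing` on each `*_T2w_seg-manual.nii.gz` file under `derivatives/labels` would flag the segmentations whose header no longer reports the resampled spacing.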

To avoid future mistakes, I think it would be preferable to push my modified images to git-annex. Also, to simplify future use of the data, I added these lines to my script to put the SC segmentations in the same space as the images. Is that OK?

NathanMolinier commented 6 months ago

After more investigation, I found this commit: [Screenshot 2024-03-26 at 10 14 31]

I checked out the previous commit and downloaded the same images from sub-cal056:

[Screenshot 2024-03-26 at 10 33 55]

This commit is indeed the problem.

Could something unexpected have happened, @plbenveniste?

plbenveniste commented 6 months ago

Thanks for highlighting this! Great catch! This is the script used: dataset_correction.py. I wonder what could have happened. Maybe we should take a look at the other datasets as well, then...

jcohenadad commented 6 months ago

This is the script used : dataset_correction.py.

It would be useful to see the terminal log (i.e. stdout) of the script applied to the dataset, so we can investigate what differed before applying the script, and what changed (or was supposed to change).

valosekj commented 6 months ago

It seems that the dataset_correction.py script iterated not only through PSIR/STIR images (used by Pierre-Louis for his trainings) but also through T2w images, due to line 22. Then, since the T2w SC segs had a different orientation and resolution than the T2w images (because reorientation and resampling were applied before the SC segmentation), the script incorrectly copied the T2w header to the T2w SC seg (line 64).
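
A guard along the following lines might have prevented this. It is only a sketch and not the actual logic of dataset_correction.py: the function names and the list of allowed contrasts are hypothetical, and matching dimensions are a necessary but not sufficient condition (orientation would need a separate check).

```python
from pathlib import PurePosixPath

# Assumption: contrasts the correction was meant to touch.
ALLOWED_SUFFIXES = ("PSIR", "STIR")


def bids_suffix(filename):
    """Return the BIDS suffix, e.g. 'T2w' for 'sub-cal056_ses-M0_T2w.nii.gz'."""
    name = PurePosixPath(filename).name
    stem = name.split(".", 1)[0]      # drop '.nii.gz'
    return stem.rsplit("_", 1)[-1]    # last underscore-separated entity


def is_target_image(filename, allowed=ALLOWED_SUFFIXES):
    """Only images with an explicitly allowed contrast should be processed."""
    return bids_suffix(filename) in allowed


def can_copy_header(image_dim, seg_dim):
    """Refuse to copy a header between arrays that do not share the same grid size."""
    return tuple(image_dim) == tuple(seg_dim)


print(is_target_image("sub-cal056/ses-M0/anat/sub-cal056_ses-M0_T2w.nii.gz"))  # False
print(can_copy_header((56, 512, 512), (56, 320, 320)))                         # False
```

Either check on its own would have skipped the T2w SC segmentations: the contrast filter excludes T2w files, and the dimension check rejects the 512 vs 320 mismatch reported in this issue.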

I believe we can recover the T2w SC segs by going back in the git history.

BTW, I would also be very careful about other contrasts and datasets, since the script could have modified other images in an unpredictable way.

NathanMolinier commented 5 months ago

I fixed the SC segmentation headers, resampled them and reoriented all the images to RPI using this script. I also QCed the final segmentations to make sure that everything was OK.

Now I just need to push the data, but I have to ask for access first.

Notes: I noticed that subject sub-cal088 had a really bad SC segmentation, probably due to artifacts present in the image; however, I still updated the header. I also noticed that subject sub-064 had a T2w image but no SC segmentation, even though the image looks good.

jcohenadad commented 5 months ago

Few comments:

plbenveniste commented 5 months ago

Closing this issue, as the corrections were applied and the corrected data was merged into the main branch. Thanks everybody for your feedback!