ivadomed / canproco

Code for preprocessing the CanProCo brain and spinal cord dataset
MIT License

Verifying that HC's lesion masks are empty #96

Open plbenveniste opened 3 months ago

plbenveniste commented 3 months ago

Opening this issue to check if the Healthy Controls' (HC) lesion masks are empty (meaning that there are no segmented lesions on the masks).

This is related to this comment.

Tagging @jcohenadad and @leelisae, who were part of the initial discussion.

plbenveniste commented 3 months ago

Spoiler alert: PROBLEMS found!

The code I used to track the HC and verify that their respective masks are empty is the following:


import os
import sys
import warnings

import nibabel as nib

# Check that both arguments are provided before using them
if len(sys.argv) < 3:
    print("Usage:", sys.argv[0], "<participants.tsv> <labels_folder>")
    sys.exit(1)

# Print the file name
print("File name:", sys.argv[1])

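# Extract participant IDs with HC pathology (read from the 4th tab-separated column)
with open(sys.argv[1], 'r') as file:
    rows = [line.rstrip('\n').split('\t') for line in file]
# Skip blank or short lines instead of failing with an IndexError
participant_ids = [row[0] for row in rows if len(row) > 3 and row[3] == "HC"]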

# Iterate over participant IDs
for participant_id in participant_ids:
    # print("Participant ID:", participant_id)

    # Find the files ending in lesion-manual.nii.gz under the labels folder (second argument), in participant_id's subfolders
    for root, dirs, files in os.walk(os.path.join(sys.argv[2], participant_id)):
        for file in files:
            if file.endswith("lesion-manual.nii.gz"):
                # print("File:", os.path.join(root, file))
                # Check if the file contains only zeros
                img = nib.load(os.path.join(root, file))
                data = img.get_fdata()
                if data.sum() == 0:
                    print(participant_id, ": Only zeros")
                # Otherwise, raise a warning naming the file
                else:
                    warnings.warn(file + ": Non-zero values found in lesion mask", UserWarning, stacklevel=2)
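
The HC selection in this script depends on the pathology being the fourth column of participants.tsv. Here is a minimal sketch of the same selection by column name instead. It assumes the header row names the columns `participant_id` and `pathology`; those column names and the helper name `get_hc_participants` are assumptions, not something checked against the actual file:

```python
import csv


def get_hc_participants(tsv_path, id_column="participant_id", pathology_column="pathology"):
    """Return the IDs of participants whose pathology is "HC".

    The default column names are assumptions about the header row
    of participants.tsv; adjust them to match the real file.
    """
    with open(tsv_path, newline="") as tsv_file:
        # DictReader uses the first row as column names
        reader = csv.DictReader(tsv_file, delimiter="\t")
        return [row[id_column] for row in reader if row.get(pathology_column) == "HC"]
```

Selecting by name keeps working if columns are added or reordered in the TSV, which a positional index would not.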

The script (saved as get_HC.py) can be run with the following command:

python get_HC.py ~/Documents/data/canproco/participants.tsv ~/Documents/data/canproco/derivatives/labels

Here is the output:

File name: /Users/plbenveniste/Documents/data/canproco/participants.tsv
sub-cal110 : Only zeros
sub-cal163 : Only zeros
sub-cal186 : Only zeros
sub-cal187 : Only zeros
sub-cal194 : Only zeros
sub-cal195 : Only zeros
sub-cal199 : Only zeros
sub-cal200 : Only zeros
sys:1: UserWarning: sub-cal201_ses-M0_STIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-cal202 : Only zeros
sub-edm123 : Only zeros
sub-edm127 : Only zeros
sub-edm128 : Only zeros
sub-edm129 : Only zeros
sub-edm138 : Only zeros
sub-edm166 : Only zeros
sub-edm168 : Only zeros
sub-edm169 : Only zeros
sub-edm177 : Only zeros
sub-edm178 : Only zeros
sub-edm179 : Only zeros
sub-mon014 : Only zeros
sub-mon014 : Only zeros
sys:1: UserWarning: sub-mon027_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-mon031 : Only zeros
sys:1: UserWarning: sub-mon031_ses-M12_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-mon065 : Only zeros
sys:1: UserWarning: sub-mon096_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-mon105 : Only zeros
sys:1: UserWarning: sub-mon126_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sys:1: UserWarning: sub-mon155_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sys:1: UserWarning: sub-mon169_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sys:1: UserWarning: sub-mon172_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-mon202 : Only zeros
sub-tor038 : Only zeros
sub-tor038 : Only zeros
sys:1: UserWarning: sub-tor039_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-tor039 : Only zeros
sub-tor051 : Only zeros
sub-tor051 : Only zeros
sub-tor052 : Only zeros
sub-tor053 : Only zeros
sub-tor054 : Only zeros
sub-tor055 : Only zeros
sub-tor057 : Only zeros
sub-tor092 : Only zeros
sub-tor148 : Only zeros
sub-van005 : Only zeros
sub-van138 : Only zeros
sub-van139 : Only zeros
sub-van141 : Only zeros
sys:1: UserWarning: sub-van159_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sys:1: UserWarning: sub-van160_ses-M0_PSIR_lesion-manual.nii.gz: Non-zero values found in lesion mask
sub-van162 : Only zeros
sub-van163 : Only zeros
sub-van164 : Only zeros
sub-van165 : Only zeros

So 11 HC lesion masks contain non-zero voxels: we mistakenly segmented lesions in healthy controls, where none should exist, which is really problematic. Thanks @leelisae for pointing this out to us.
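
To turn the output above into a list of masks to fix, the warning lines can be parsed. This is a minimal sketch that assumes the script output was saved to a text file (stdout and stderr together, since warnings are written to stderr). The function names and the pattern are illustrative, not part of the repo:

```python
import re

# Matches the warning lines printed by the script above (pattern is an assumption
# based on the output shown in this comment).
WARNING_PATTERN = re.compile(
    r"UserWarning: (\S+lesion-manual\.nii\.gz): Non-zero values found"
)


def non_empty_masks(log_text):
    """Return the mask file names flagged as non-empty, in order of appearance."""
    return WARNING_PATTERN.findall(log_text)


def subjects_of(mask_names):
    """Return the sorted, de-duplicated subject IDs (e.g. "sub-mon031") of the masks."""
    return sorted({name.split("_")[0] for name in mask_names})
```

For example, `non_empty_masks(open("hc_check.log").read())` would return the flagged file names, where `hc_check.log` is a hypothetical file holding the captured output.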

TODO:

plbenveniste commented 3 months ago

The corrections have been pushed to branch plb/correct_hc_lesion_mask. A PR has been created and is ready for review.

Should I send the updated segmentations to Lisa?

jcohenadad commented 3 months ago

"Should I send the updated segmentations to Lisa?"

We should also send them to the UBC team.

plbenveniste commented 3 months ago

Here is the latest version of the M0 lesion segmentations: canproco_M0_lesion_segmentations.zip

Before sending the M0 lesion segmentations to the UBC team, I will first merge their latest version of the combined M0 and M12 data (email from May 16th).

Tagging @leelisae and @aman-s to let them know that they can find the segmentations here in the meantime.

lisaeylee commented 3 months ago

@plbenveniste Awesome! To clarify: on my end, I'll remove the lesion segmentations I received from you earlier (Issue #88) and replace them with the lesion segmentations above, which I will use in my project to calculate MTR for ROIs excluding lesions. Please let me know if I misunderstood this. Thanks!

lisaeylee commented 3 months ago

@plbenveniste While working on this analysis, I found that there were some lesion masks that I expected but could not find in the zip folder you provided above. Could you please clarify? Thank you!

Missing lesion masks:

Site: Edmonton
sub-edm161_ses-M0_PSIR_lesion-manual.nii.gz (CAN-05-RRM-161-M0)

Site: Montreal
sub-mon015_ses-M0_PSIR_lesion-manual.nii.gz (CAN-03-CON-015-M0)
sub-mon052_ses-M0_PSIR_lesion-manual.nii.gz (CAN-03-RRM-052-M0)
sub-mon164_ses-M0_PSIR_lesion-manual.nii.gz (CAN-03-RRM-164-M0)

plbenveniste commented 3 months ago

Hi @lisaeylee!

I checked the history of the files using the following command:

git log --follow -- sub-mon015/ses-M0/anat/sub-mon015_ses-M0_PSIR.nii.gz

These files have no segmentation because they were added in January with the corrected M0 data, and we did not take the time to annotate them afterwards. There is no other reason for the segmentation masks to be absent.
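
To catch other images that were added without a mask, a check along these lines could be run on the dataset. It is only a sketch: it assumes images live under `sub-XXX/ses-M0/anat/` and masks under `derivatives/labels/` with the same subfolders, and the function name and parameters are made up for illustration:

```python
from pathlib import Path


def images_without_lesion_mask(dataset_root, contrasts=("PSIR", "STIR"), session="ses-M0"):
    """List image file names under dataset_root that have no lesion-manual mask.

    Assumed layout (hypothetical, adapt to the real dataset):
      <root>/sub-XXX/ses-M0/anat/sub-XXX_ses-M0_PSIR.nii.gz
      <root>/derivatives/labels/sub-XXX/ses-M0/anat/sub-XXX_ses-M0_PSIR_lesion-manual.nii.gz
    """
    root = Path(dataset_root)
    labels = root / "derivatives" / "labels"
    missing = []
    for contrast in contrasts:
        # The pattern starts with "sub-*", so files under derivatives/ are not matched
        for image in sorted(root.glob(f"sub-*/{session}/anat/*_{contrast}.nii.gz")):
            stem = image.name[: -len(".nii.gz")]
            mask = labels / image.relative_to(root).parent / f"{stem}_lesion-manual.nii.gz"
            if not mask.exists():
                missing.append(image.name)
    return missing
```

Images that were deliberately left unsegmented would also show up in this list, so the result still needs a manual look.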

lisaeylee commented 3 months ago

@plbenveniste - To clarify:

(1) Can I assume that all the M0 SC lesion masks you sent me were created using the corrected M0 data? Last summer, UBC identified that some M0 data were overwritten by M12 data. Around 30-35 data were affected.

(2) If yes, when would the above lesion masks be available? If there is no plan to do this soon, it's no problem. I can run the analysis for the rest.

Thanks!

plbenveniste commented 3 months ago

(1) Yes, the lesion masks were created using the corrected M0 data.

(2) On it now, see issue #99.

plbenveniste commented 3 months ago

Hi @lisaeylee, here is the new batch of lesion segmentations. Thanks again for your feedback. canproco_M0_lesion_segmentations.zip

Also, as I mentioned in this comment, some files are listed in an exclude.yml file because they contain too many artifacts for us to segment the lesions or the spinal cord. This might impact your registration task.

Good luck with the rest of your work 😃

lisaeylee commented 3 months ago

@plbenveniste - Where can I find the exclude.yml file? I assumed that I could use all lesion mask files that are provided.

plbenveniste commented 2 months ago

Hi Lisa! Sorry for the late reply. The exclude.yml file is in this repo: https://github.com/ivadomed/canproco/blob/main/exclude.yml

leelisae commented 2 months ago

@plbenveniste - No worries - you actually had already answered this in another thread! All good :)