spinalcordtoolbox / manual-correction

Scripts for the manual correction of spinal cord labels
MIT License

Missing files error when using `*` wildcard #35

Closed (valosekj closed this issue 1 year ago)

valosekj commented 1 year ago

When using the following yml config file:

FILES_LESION:
 - sub-*_acq-ax_T2w.nii.gz

the manual_correction.py script does not interpret `*` as a wildcard and prints an error about missing files:

(venv) valosek@macbook-pro:~/code/manual-correction$ python manual_correction.py -config ~/data/data.neuro.polymtl.ca/dcm-zurich-lesions/compression_labels.yml -path-in ~/data/data.neuro.polymtl.ca/dcm-zurich-lesions
The following files are missing:
['/Users/valosek/data/data.neuro.polymtl.ca/dcm-zurich-lesions/sub-*/anat/sub-*_acq-ax_T2w.nii.gz']

Please check that the files listed in the yaml file and the input path are correct.

The following label files are missing:
['/Users/valosek/data/data.neuro.polymtl.ca/dcm-zurich-lesions/sub-*/anat/sub-*_acq-ax_T2w_lesion.nii.gz']

Please check that the used suffix '_lesion' is correct. If not, you can provide custom suffix using '-suffix-files-' flags.

Despite these errors, the script still continues running.
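A minimal sketch of the behaviour, assuming the missing-file check tests the unexpanded pattern with something like os.path.exists (I have not confirmed this against the script). The helper name expand_patterns and the temporary dataset layout are hypothetical and only serve to illustrate expanding the wildcard with glob.glob before checking for missing files:

```python
import glob
import os
import tempfile


def expand_patterns(path_in, patterns):
    """Expand wildcard patterns; return (found_files, unmatched_patterns).

    Hypothetical helper, not part of manual_correction.py.
    """
    found, unmatched = [], []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.join(path_in, pattern)))
        if matches:
            found.extend(matches)
        else:
            # Only patterns that match nothing are reported as missing.
            unmatched.append(pattern)
    return found, unmatched


# Build a tiny throwaway dataset to try the helper on.
tmp_dir = tempfile.mkdtemp()
for sub in ("sub-01", "sub-02"):
    anat = os.path.join(tmp_dir, sub, "anat")
    os.makedirs(anat)
    open(os.path.join(anat, f"{sub}_acq-ax_T2w.nii.gz"), "w").close()

pattern = os.path.join("sub-*", "anat", "sub-*_acq-ax_T2w.nii.gz")

# Checking the literal path fails because no directory is named "sub-*".
literal_exists = os.path.exists(os.path.join(tmp_dir, pattern))

# Expanding the pattern first finds one file per subject.
found, unmatched = expand_patterns(tmp_dir, [pattern])
print(literal_exists, len(found), unmatched)
```

With this approach, a pattern would be reported as missing only when it matches no file at all.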

NathanMolinier commented 1 year ago

I just had an issue while using wildcards with the following YAML file:

FILES_LABEL:
- sub-*.nii.gz

The script is looking for a centerline file for no reason:

[Screenshot, 2023-08-18, 2:58:52 PM]

Do you have any idea why, @valosekj?

valosekj commented 1 year ago

Weird! Can you please debug the following lines:

https://github.com/spinalcordtoolbox/manual-correction/blob/13cd3f4e599161aa1c5866d66c60a25a0f3c006a/manual_correction.py#L682-L687

NathanMolinier commented 1 year ago

I just did, and I get the following:

subject=''
ses=''
filename='sub-*.nii.gz'

And `files`:

[Screenshot, 2023-08-18, 3:14:31 PM]

NathanMolinier commented 1 year ago

This line needs to be fixed:

files = sorted(glob.glob(os.path.join(path_img, '**', filename),
 recursive=True))
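
To illustrate the concern with this line, here is a small sketch. It assumes that path_img resolves to the dataset root when subject and ses are empty, and it uses a hypothetical directory layout with a made-up label file name. Under those assumptions, the recursive pattern matches files in every subdirectory, not only the images:

```python
import glob
import os
import tempfile

# Hypothetical dataset: one image and one label file under derivatives.
path_img = tempfile.mkdtemp()
layout = [
    (os.path.join("sub-01", "anat"), "sub-01_T2w.nii.gz"),
    (os.path.join("derivatives", "labels", "sub-01", "anat"),
     "sub-01_T2w_label.nii.gz"),  # made-up label file name
]
for folder, name in layout:
    os.makedirs(os.path.join(path_img, folder), exist_ok=True)
    open(os.path.join(path_img, folder, name), "w").close()

filename = "sub-*.nii.gz"

# Same call as in the line above: '**' with recursive=True descends into
# every subdirectory, so the label file under derivatives is matched too.
files = sorted(glob.glob(os.path.join(path_img, "**", filename),
                         recursive=True))
relative = [os.path.relpath(f, path_img) for f in files]
print(relative)
```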

valosekj commented 1 year ago

Good catch! You are right: with '**' and recursive=True, glob.glob returns every file under path_img that matches the filename pattern. The files should instead be filtered based on the key (e.g., FILES_LABEL). We can filter using suffix_dict[task], as done here:

https://github.com/spinalcordtoolbox/manual-correction/blob/13cd3f4e599161aa1c5866d66c60a25a0f3c006a/manual_correction.py#L711

NathanMolinier commented 1 year ago

Not really, because the variable `files` should only contain paths to images, not labels.

Moreover, it is not clear to me how these paths were created, since no _centerline files are available in the whole-spine dataset.
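
On the first point (`files` should contain only image paths), one possible approach is to drop everything located under a derivatives folder after globbing. This is only a sketch: the helper name keep_images_only, the BIDS-style "derivatives" folder, and the example paths are all assumptions, and this is not necessarily how the script should be fixed:

```python
import os


def keep_images_only(files, path_img, excluded_dir="derivatives"):
    """Drop paths that lie under `excluded_dir` (relative to path_img).

    Hypothetical helper, not part of manual_correction.py.
    """
    kept = []
    for f in files:
        parts = os.path.relpath(f, path_img).split(os.sep)
        if excluded_dir not in parts:
            kept.append(f)
    return kept


# Made-up example paths; no files need to exist for this check.
path_img = os.path.join(os.sep, "data", "whole-spine")
files = [
    os.path.join(path_img, "sub-01", "anat", "sub-01_T2w.nii.gz"),
    os.path.join(path_img, "derivatives", "labels", "sub-01", "anat",
                 "sub-01_T2w_label.nii.gz"),
]

images = keep_images_only(files, path_img)
print(images)
```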