bids-standard / legacy-validator

Validator for the Brain Imaging Data Structure
https://bids-standard.github.io/bids-validator/
MIT License

Error on physio data in anat sub-directory #2164

Closed: tstoeter closed this issue 2 weeks ago

tstoeter commented 2 weeks ago

While conducting our multi-modal MRI study, we also acquired a physiological baseline (cardiac and respiratory) in parallel with the anatomical measurements. According to the BIDS specification on physiological recordings (https://bids-specification.readthedocs.io/en/latest/modality-specific-files/physiological-recordings.html), the physio data files (.tsv.gz and .json) may be stored in the subject's anatomical recordings sub-directory. We have T1w and PDw anatomical recordings, and corresponding physio data only for the T1w recording. The subject directories currently look as follows:

...
./physio.json
./sub-01/
./sub-01/anat/
./sub-01/anat/sub-01_PDw.nii.gz
./sub-01/anat/sub-01_T1w.nii.gz
./sub-01/anat/sub-01_T1w.json
./sub-01/anat/sub-01_PDw.json
./sub-01/anat/sub-01_physio.tsv.gz
...

The specification's naming convention for physio data within modality sub-directories is:

<matches>[_recording-<label>]_physio.tsv.gz
<matches>[_recording-<label>]_physio.json

Therefore, in the example above, the part <matches> is sub-01, [_recording-<label>] is empty (we only have one physio recording), and the modality suffix is _physio. Hence, we have sub-01_physio.tsv.gz as the filename. Now, we are facing two problems.

  1. With this naming scheme we cannot relate the physio data to the T1w recording from the filename. Is there an alternative way to capture this relationship?
  2. The bids-validator (bids-validator@1.14.14 and also 1.14.15-dev) rejects this naming scheme with an error:
1: [ERR] Files with such naming scheme are not part of BIDS specification. ... (code: 1 - NOT_INCLUDED)
        ./sub-01/anat/sub-01_physio.tsv.gz
            Evidence: sub-01_physio.tsv.gz

I had a look at the rules file bids-validator/bids_validator/rules/file_level_rules.json in the source code. For EEG, *_physio.tsv.gz seems to be covered:

      "@@@_cont_ext_@@@": [
        "_physio\\.tsv\\.gz",
        "_stim\\.tsv\\.gz",
        "_physio\\.json",
        "_stim\\.json"
      ]

However, there is no such block for "anat_nonparametric" (where we have the T1w and PDw suffixes) in this rules file. Is this why the bids-validator complains? I'm happy to provide a PR if this is the correct place to fix it.
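
To illustrate the suspected cause, here is a minimal sketch of how a list of allowed suffixes decides whether a filename passes. The suffix strings in `CONT_EXT` are taken from the "@@@_cont_ext_@@@" block quoted above; `ANAT_EXT`, `build_pattern`, and the overall shape of the regex are simplified assumptions for illustration and do not reproduce the validator's actual rules.

```python
import re

# Continuous-recording suffixes, as listed in the "@@@_cont_ext_@@@" block.
CONT_EXT = [r"_physio\.tsv\.gz", r"_stim\.tsv\.gz", r"_physio\.json", r"_stim\.json"]

# Hypothetical, simplified anat suffixes (only the two used in this dataset).
ANAT_EXT = [r"_T1w\.nii\.gz", r"_PDw\.nii\.gz", r"_T1w\.json", r"_PDw\.json"]


def build_pattern(extensions):
    """Build a simplified filename regex: sub-<label> followed by one allowed suffix."""
    return re.compile(r"^sub-[a-zA-Z0-9]+(?:" + "|".join(extensions) + r")$")


# Rule set without and with the continuous-recording suffixes.
anat_only = build_pattern(ANAT_EXT)
anat_with_physio = build_pattern(ANAT_EXT + CONT_EXT)

name = "sub-01_physio.tsv.gz"
print(bool(anat_only.match(name)))         # False: suffix not in the allowed list
print(bool(anat_with_physio.match(name)))  # True: suffix allowed once the block is added
```

Under these assumptions, the physio file is rejected only because its suffix is missing from the list for the anat rule, which is consistent with the NOT_INCLUDED error shown above.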

rwblair commented 2 weeks ago

Looks like this is an issue in both the node and deno versions of the validator.

The node version validates filenames using the regexes in bids-validator/bids_validator/rules/, as you showed. I went ahead and made a PR for it: #2165.

The deno version validates file names using the schema in the bids-specification itself. I'm still working on a PR for the specification to fix this.

effigies commented 2 weeks ago

It looks like https://github.com/bids-standard/bids-specification/pull/513 explicitly permitted it for anat, but the naming convention was not clarified for datatypes that permit unrelated scans that are distinguished only by suffix (which I think is only anat...).

I guess you could use acq-T1w_physio.tsv.gz to indicate that it corresponds to a T1w scan. The mod- entity would seem to be applicable here, but that would require an addition to the specification to actually say that it can be used.
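
The acq- suggestion can be sketched as a small helper that assembles a physio filename from its entities. `physio_filename` and its parameters are hypothetical names introduced only for illustration; the sketch assumes the template `<matches>[_recording-<label>]_physio.tsv.gz` quoted earlier in this thread, with acq- placed before recording-.

```python
def physio_filename(sub, acq=None, recording=None, ext=".tsv.gz"):
    """Assemble a physio filename from BIDS-style key-value entities."""
    parts = [f"sub-{sub}"]
    if acq is not None:
        # Optional acq- label, used here to point at the matching scan.
        parts.append(f"acq-{acq}")
    if recording is not None:
        # Optional recording- label, needed only with several physio recordings.
        parts.append(f"recording-{recording}")
    return "_".join(parts) + "_physio" + ext


print(physio_filename("01"))                            # sub-01_physio.tsv.gz
print(physio_filename("01", acq="T1w"))                 # sub-01_acq-T1w_physio.tsv.gz
print(physio_filename("01", acq="T1w", ext=".json"))    # sub-01_acq-T1w_physio.json
```

The same call with `ext=".json"` gives the name of the sidecar, so the data file and its metadata stay paired.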

tstoeter commented 2 weeks ago

Thank you very much @rwblair and @effigies for fixing our issues so quickly! I tested the latest version with file names sub-01_physio.tsv.gz, as well as sub-01_acq-T1w_physio.tsv.gz. In both cases, the validator passed without error. We will stick with acq-T1w for making the relation clear. Thank you @effigies for pointing it out!