bids-standard / bids-examples

A set of BIDS compatible datasets with empty raw data files that can be used for writing lightweight software tests.
http://bids-standard.github.io/bids-examples/
169 stars 134 forks source link

New dataset "dwi_deriv" for scanner-generated DWI derivatives #452

Open Lestropie opened 3 months ago

Lestropie commented 3 months ago

Relates to:

My current plan is to supersede https://github.com/bids-standard/bids-specification/pull/1831 by extending it to include missing DWI sequence derivatives and resolve https://github.com/bids-standard/bids-specification/issues/1862. Will need to wait to see how that discussion pans out as to whether any of these data need to be changed.

Lestropie commented 1 month ago

I'm getting warnings & errors attempting to run the validator locally on this proposed dataset utilising the specification version in https://github.com/bids-standard/bids-specification/pull/1864. I'm not so familiar with the ecosystem so might need some help.

The BIDSVersion field of 'dataset_description.json' does not match a known release. The BIDS Schema used for validation may be out of date.

If the dataset is proposed for compatibility with a future update to the BIDS specification, and that has been hard-coded into dataset_description.json, should this warning be ignored? Or is there some other accepted practise?

Multiple filename rules were found as potential matches. All of them had at least one issue during filename validation. (ALL_FILENAME_RULES_HAVE_ISSUES) ./sub-01/dwi/sub-01_S0map.json Evidence: Rules that matched with issues: rules.files.common.tables.phenotype, rules.files.raw.anat.parametric and: The datatype directory does not match datatype of found suffix and extension (DATATYPE_MISMATCH) and: Extension used by file does not match allowed extensions for its suffix (EXTENSION_MISMATCH) ./sub-01/dwi/sub-01_expADC.nii Evidence: Rule: rules.files.common.tables.phenotype

"phenotype" seems to be a catch-all rule that will therefore be flagged in any such error. It seems to be missing the fact that the S0map suffix can match rules.files.raw.dwi.ScannerDerivatives, and that's causing multiple errors. The issue is not caused by the use of camel case. Is there perhaps somewhere that rules.files.raw.dwi.ScannerDerivatives needed to be explicitly added to a list of rules but was erroneously omitted from https://github.com/bids-standard/bids-specification/pull/1725?

'SliceTiming' is defined for this file. Neither 'DelayTime' nor 'AcquisitionDuration' may be defined in addition. (SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE)

This seems unusual to me. For any 4D temporal dataset (eg. fMRI, DWI), SliceTiming may be wanted for timing / motion correction, but AcquisitionDuration may itself also be of interest. I don't see a good reason for a blanket mutex here. It seems to have come in https://github.com/bids-standard/bids-specification/pull/1410/commits/f0e1b1c13728aa7c0e35310f90b753a044018044 as part of https://github.com/bids-standard/bids-specification/pull/1410. One thing I note is that the prior code was explicit about applying to mri/func, whereas with the new code it's implicit by residing in the func.yaml file; don't know if that has anything to do with a rule intended for fMRI being applied here? Or maybe it's deliberately applied to DWI as well, but is perhaps then defined in the wrong file?

EDIT: I should add that these fields were both present in unmodified dcm2niix outputs. So the mutex violation isn't due to a manual modification on my part.

effigies commented 1 month ago

The BIDSVersion field of 'dataset_description.json' does not match a known release. The BIDS Schema used for validation may be out of date.

If the dataset is proposed for compatibility with a future update to the BIDS specification, and that has been hard-coded into dataset_description.json, should this warning be ignored? Or is there some other accepted practise?

This is a bug. I've not been digging into it because it's a warning instead of an error, but it is misleading. I'll try to get to it soon.

Multiple filename rules were found as potential matches. All of them had at least one issue during filename validation. (ALL_FILENAME_RULES_HAVE_ISSUES) ./sub-01/dwi/sub-01_S0map.json Evidence: Rules that matched with issues: rules.files.common.tables.phenotype, rules.files.raw.anat.parametric and: The datatype directory does not match datatype of found suffix and extension (DATATYPE_MISMATCH) and: Extension used by file does not match allowed extensions for its suffix (EXTENSION_MISMATCH) ./sub-01/dwi/sub-01_expADC.nii Evidence: Rule: rules.files.common.tables.phenotype

"phenotype" seems to be a catch-all rule that will therefore be flagged in any such error. It seems to be missing the fact that the S0map suffix can match rules.files.raw.dwi.ScannerDerivatives, and that's causing multiple errors. The issue is not caused by the use of camel case. Is there perhaps somewhere that rules.files.raw.dwi.ScannerDerivatives needed to be explicitly added to a list of rules but was erroneously omitted from bids-standard/bids-specification#1725?

This should be fixed by https://github.com/bids-standard/bids-validator/pull/2079. Can you try the latest (pre-release, so let me know if you need instructions) version?

'SliceTiming' is defined for this file. Neither 'DelayTime' nor 'AcquisitionDuration' may be defined in addition. (SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE)

This seems unusual to me. For any 4D temporal dataset (eg. fMRI, DWI), SliceTiming may be wanted for timing / motion correction, but AcquisitionDuration may itself also be of interest. I don't see a good reason for a blanket mutex here. It seems to have come in bids-standard/bids-specification@f0e1b1c as part of bids-standard/bids-specification#1410. One thing I note is that the prior code was explicit about applying to mri/func, whereas with the new code it's implicit by residing in the func.yaml file; don't know if that has anything to do with a rule intended for fMRI being applied here? Or maybe it's deliberately applied to DWI as well, but is perhaps then defined in the wrong file?

The reason for this was to avoid duplication and thus possible inconsistency. We could suppress the warning if the AcquisitionDuration implied by SliceTiming matches the actually found duration.

EDIT: I should add that these fields were both present in unmodified dcm2niix outputs. So the mutex violation isn't due to a manual modification on my part.

* [ ]  Resolve:
  > A data file's JSON sidecar is missing a key listed as recommended. (SIDECAR_KEY_RECOMMENDED)
  > ./sub-01/dwi/sub-01_S0map.nii
  > Evidence: missing InstitutionName as per rules.sidecars.mri.MRIInstitutionInformation

  I was asked to remove this; so I'll replace it with eg. the string `"redacted"`.

The schema validator warns on all missing RECOMMENDED fields. It's going to get very noisy, but it's unclear what the purpose of RECOMMENDing is if not to warn. I think we should demote a lot of RECOMMENDED to OPTIONAL, as most do not impact the ability to interpret a dataset.

Lestropie commented 1 month ago

This is a bug.

Not high priority; just curious if it's something that many had encountered previously and there was an accepted solution.

This should be fixed by https://github.com/bids-standard/bids-validator/pull/2079. Can you try the latest (pre-release, so let me know if you need instructions) version?

Was not successful re-attempting based on prior instruction. Certainly got a different set of warnings upon updating / reloading, but core issue (claiming files with new suffixes are violations) still there. Might need an expert hand, there's too many moving parts here with which I'm unfamiliar.

The reason for this was to avoid duplication and thus possible inconsistency. We could suppress the warning if the AcquisitionDuration implied by SliceTiming matches the actually found duration.

While it's possible that in some circumstances there may be duplication, I don't think that's always the case.

Consider the DWI acquisition here:

  1. The time required to acquire each volume is 4.5s, and 31 volumes were acquired. That's 139.5s of time taken to acquire the exported data. AcquisitionDuration is however 195.33s. There's extra time required before image data read can begin: achieve magnetisation steady-state for single-band pulses, acquire GRAPPA calibration data, acquire SMS calibration data, achieve magnetisation steady-state for multi-band pulses.

  2. The above reasoning is based on RepetitionTime. For 4D series, SliceTiming encodes the differences in commencement of acquisition of different slices within a volume. While you could take a reasonable guess at the time required to acquire each volume based on the contents of SliceTiming, it's not guaranteed (eg. possible to have a long pause between the last slice group in the volume and the first slice group in the next volume).

effigies commented 1 month ago

Testing your updated schema in bids-standard/bids-specification#1864 and the latest validator (master branch):

Output ```console ❯ BIDS_SCHEMA=file:///$HOME/Projects/bids/specification/src/schema.json deno run -A https://github.com/bids-standard/bids-validator/raw/master/bids-validator/src/bids-validator.ts --ignoreNiftiHeaders dwi_deriv [WARNING] The BIDSVersion field of 'dataset_description.json' does not match a known release. The BIDS Schema used for validation may be out of date. (UNKNOWN_BIDS_VERSION) ./dataset_description.json Evidence: schema.rules.rules.checks.dataset.UnknownVersion 1 more files with the same issue Please visit https://neurostars.org/search?q=UNKNOWN_BIDS_VERSION for existing conversations about this issue. [WARNING] The 'Authors' field of 'dataset_description.json' should contain an array of values - with one author per value. This was triggered based on the presence of only one author field. Please ignore if all contributors are already properly listed. (TOO_FEW_AUTHORS) ./dataset_description.json Evidence: schema.rules.rules.checks.hints.TooFewAuthors 1 more files with the same issue Please visit https://neurostars.org/search?q=TOO_FEW_AUTHORS for existing conversations about this issue. [ERROR] Empty files not allowed. (EMPTY_FILE) ./sub-01/dwi/sub-01_ADC.nii ./sub-01/dwi/sub-01_FA.nii ./sub-01/dwi/sub-01_S0map.nii 7 more files with the same issue Please visit https://neurostars.org/search?q=EMPTY_FILE for existing conversations about this issue. [WARNING] A data file's JSON sidecar is missing a key listed as recommended. (SIDECAR_KEY_RECOMMENDED) ./sub-01/dwi/sub-01_ADC.nii Evidence: missing InstitutionAddress as per schema.rules.rules.sidecars.mri.MRIInstitutionInformation ./sub-01/dwi/sub-01_FA.nii Evidence: missing InstitutionAddress as per schema.rules.rules.sidecars.mri.MRIInstitutionInformation ./sub-01/dwi/sub-01_S0map.nii Evidence: missing InstitutionAddress as per schema.rules.rules.sidecars.mri.MRIInstitutionInformation 8 more files with the same issue Please visit https://neurostars.org/search?q=SIDECAR_KEY_RECOMMENDED for existing conversations about this issue. [ERROR] 'SliceTiming' is defined for this file. Neither 'DelayTime' nor 'AcquisitionDuration' may be defined in addition. (SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE) ./sub-01/dwi/sub-01_dwi.bval Evidence: schema.rules.rules.checks.func.SliceTimingAcquisitionDurationMutex ./sub-01/dwi/sub-01_dwi.bvec Evidence: schema.rules.rules.checks.func.SliceTimingAcquisitionDurationMutex ./sub-01/dwi/sub-01_dwi.nii Evidence: schema.rules.rules.checks.func.SliceTimingAcquisitionDurationMutex 3 more files with the same issue Please visit https://neurostars.org/search?q=SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE for existing conversations about this issue. Summary: Available Tasks: Available Modalities: 18 Files, 21.4 kB MRI 1 - Subjects 1 - Sessions If you have any questions, please post on https://neurostars.org/tags/bids. ```

So the SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE is the main issue. On that note, it seems that dcm2niix is producing an AcquisitionDuration that does not follow the BIDS usage, which is specifically the time it takes to acquire a single volume. The BIDS AcquisitionDuration use case is quite niche, for trigger-gated acquisitions that result in an irregular (or at least non-constant) interval between volume onsets.

I presume dcm2niix is reproducing a DICOM field of the same name, and probably has been doing so since before BIDS defined this field.

In any case, I don't see a way to resolve this in BIDS without breaking backwards compatibility. I think the thing to do is just filter it from your JSON files.

effigies commented 1 month ago

I reread the BIDS and DICOM sections, and it looks like the inconsistency is internal to BIDS, not between the two specs: https://github.com/bids-standard/bids-specification/issues/1906

I'm removing the SLICE_TIMING_AND_DURATION_MUTUALLY_EXCLUSIVE check, since a rereading also showed that's actually not required by the spec: https://github.com/bids-standard/bids-specification/pull/1905