bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International
272 stars 156 forks source link

Precedence in JSON/NIfTI metadata mismatch #138

Open effigies opened 5 years ago

effigies commented 5 years ago

Over in bids-standard/pybids#357, we've been having a small discussion over the precedence of metadata when JSON sidecars and NIfTI disagree. Specifically the question was whether to take the 4th pixdim from the NIfTI or the RepetitionTime entry in the BIDS metadata, which only matters when there is disagreement.

It is the case that, at least for RepetitionTime, a mismatch is treated as an error by the validator. However, in practice, not every dataset is run through the validator before being exposed to pybids or pushed through a BIDS App... In user support for fMRIPrep, we've seen enough issues that would be solved by running through the validator that we now bundle the validator and run it before starting fMRIPrep.

There are other metadata entries that do not produce errors on disagreement, such as PhaseEncodingDirection and SliceEncodingDirection (packed in the dim_info field), SliceTiming (slice_code), which thus need to have a defined precedence (or begin producing errors).

fMRIPrep for one has made a policy of only querying JSON for pretty much everything besides voxel spatial dimensions, affines and data shape, and this seems like the correct approach for BIDS-aware applications. But it is also the case that we have a number of BIDS Apps that are just shims over existing non-BIDS-aware apps, which will thus only have NIfTI headers to rely upon, except where the shim writer also feeds in JSON metadata through another side channel.

I don't really have a satisfying proposal to make here, so this is mostly the start of a discussion. My inclination is that we should continue to prioritize JSON-encoded metadata, and increase the coverage of potential header/sidecar conflicts in the validator. In addition to the idea of respecting BIDS as a structure is the fact that JSON is easier to inspect and edit by hand, whereas modifying NIfTI headers requires more knowledge of some tool that can do the job. But making this a spec requirement may render some existing BIDS Apps non-compliant.

I would also suggest that a useful tool (possibly validator mode) would be to sync JSON metadata into NIfTI headers to resolve mismatches.

This may be somewhat related to #102.

satra commented 5 years ago

the other bit that's related to another discussion we had in bep001 is that the nifti header for time can be in different units.

i do agree with @yarikoptic that perhaps tools that write out both json and nifti should make them consistent, but the tool will need to modify the nifti header time units to sec to match with the json and will simply not be able to adjust it for the additional time fields being added in bep001.

so my vote is for the time being to let the json file take precedence and be the arbiter of the values. i.e. an app should not even look at the nifti header (for the raw data). how this evolves for derivatives remains to be seen. we should think about a test suite for apps to see if they pass compliance with the spec.

chrisgorgo commented 5 years ago

There are other metadata entries that do not produce errors on disagreement, such as PhaseEncodingDirection and SliceEncodingDirection (packed in the dim_info field), SliceTiming (slice_code), which thus need to have a defined precedence (or begin producing errors).

It might be worth opening issues for adding those consistency checks at https://github.com/bids-standard/bids-validator/issues.

I would vote for requiring header/JSON consistency in the spec (as it is now) and not specifying which takes precedence (since this suggest that consistency requirement is not very serious). FMRIPREP now runs the validator before starting any processing which should help resolving such issues.

effigies commented 5 years ago

Requiring consistency is likely to invalidate existing datasets, as it's very common to have unset specify dim_info or slice_code. So perhaps we should say "if set, must be consistent"?

chrisgorgo commented 5 years ago

Good point. For those fields, the spec does not talk about consistency with the headers so the spec would have to be modified before the validator. However, we might be splitting hairs here since:

satra commented 3 years ago

pinging @yarikoptic here - since this came up recently. was there ever a resolution to this in the spec?

sappelhoff commented 3 years ago

just chiming in to say that in mne-bids we are valueing BIDS metadata over data file embedded metadata. cc @hoechenberger @agramfort

effigies commented 3 years ago

Made a concrete proposal over at #761 to help move the discussion along.