bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International
"Add suffixes" to schema? #1050

Open yarikoptic opened 2 years ago

yarikoptic commented 2 years ago

https://github.com/bids-standard/bids-specification/pull/1049 (and the recently merged #1036) inspired me to look at reused suffixes, which at the moment might have duplicate and diverging specifications (in terms of extensions) across different data types:

$> git grep -e '^- suffixes' -A1 -h | grep -v -e 'suffixes' -e -- | sort | uniq -c | sort -n | grep -v '^ *1 '
      2   - electrodes
      3   - channels
      3   - coordsystem
      4   - photo
      5   - meg
      6   - events
      7   - physio

NB: because of `-A1`, only the first suffix listed under each `suffixes` key is counted, so this list could be incomplete.
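
To get around the `-A1` limitation, a minimal sketch in Python that collects every list item under each `suffixes:` key instead of only the first one. It is a plain line scanner, not a YAML parser, and assumes the one-item-per-line layout shown in the grep output; `count_suffixes` and the `SAMPLE` text are made up for illustration and are not taken from the schema.

```python
import re
from collections import Counter

# Matches "suffixes:" or "- suffixes:" at any indentation.
SUFFIX_KEY = re.compile(r"^\s*(?:-\s+)?suffixes\s*:\s*$")
# Matches a simple list item such as "  - meg", ignoring a trailing comment.
LIST_ITEM = re.compile(r"^\s*-\s+([^\s#:]+)\s*(?:#.*)?$")


def count_suffixes(yaml_text):
    """Count how often each suffix appears under a `suffixes:` key."""
    counts = Counter()
    collecting = False
    for line in yaml_text.splitlines():
        if SUFFIX_KEY.match(line):
            collecting = True
            continue
        if collecting:
            item = LIST_ITEM.match(line)
            if item:
                counts[item.group(1)] += 1
            else:
                # First non-item line (e.g. "extensions:") ends the list.
                collecting = False
    return counts


# Invented sample: "events" is second in the first group, so -A1 would miss it.
SAMPLE = """\
- suffixes:
  - channels
  - events
  extensions:
  - .tsv
- suffixes:
  - events
  extensions:
  - .json
"""

counts = count_suffixes(SAMPLE)
reused = {suffix: n for suffix, n in counts.items() if n > 1}
print(reused)
```

To run it over the real files, the text of each rules file would be read and passed to `count_suffixes`, and the resulting counters summed.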

So I thought that maybe such definitions could also be centralized somehow?

An alternative to centralization could be just a unit test which verifies that identical suffixes present in different data types have identical extensions across those data types, etc.?
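
A rough sketch of what such a check could look like in Python, assuming the rules have already been loaded into plain dicts keyed by data type. The function name `inconsistent_suffixes`, the input shape, and the `RULES` data are all hypothetical and only illustrate the idea; they are not the schema's actual loader or contents. Extensions are merged per data type first, so several rule groups for the same suffix within one data type (as with `meg`) do not count as a conflict.

```python
from collections import defaultdict


def inconsistent_suffixes(rules_by_datatype):
    """Return suffixes whose extension sets differ between data types.

    rules_by_datatype: {datatype: [{"suffixes": [...], "extensions": [...]}, ...]}
    (hypothetical shape, assumed for this sketch)
    """
    seen = defaultdict(dict)  # suffix -> {datatype: frozenset of extensions}
    for datatype, groups in rules_by_datatype.items():
        for group in groups:
            extensions = frozenset(group.get("extensions", []))
            for suffix in group.get("suffixes", []):
                previous = seen[suffix].get(datatype, frozenset())
                seen[suffix][datatype] = previous | extensions
    # A suffix is inconsistent if its data types disagree on the extension set.
    return {
        suffix: per_datatype
        for suffix, per_datatype in seen.items()
        if len(set(per_datatype.values())) > 1
    }


# Invented example data, not the real schema contents.
RULES = {
    "eeg": [{"suffixes": ["events"], "extensions": [".tsv", ".json"]}],
    "meg": [
        {"suffixes": ["meg"], "extensions": [".fif"]},
        {"suffixes": ["meg"], "extensions": [".dat"]},
        {"suffixes": ["events"], "extensions": [".tsv"]},
    ],
}

conflicts = inconsistent_suffixes(RULES)
print(sorted(conflicts))
```

In a test suite this would reduce to asserting that the returned dict is empty for the loaded schema.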

Or maybe it is just impossible anyway, since some suffixes (like `meg`) might have quite complicated rules (but that seems specific to a modality, hence I added "in different data types" above):

```shell
rules/datatypes/meg.yaml-# First group
rules/datatypes/meg.yaml-- suffixes:
rules/datatypes/meg.yaml:  - meg
rules/datatypes/meg.yaml-  extensions:
rules/datatypes/meg.yaml-  - / # corresponds to BTi/4D data
rules/datatypes/meg.yaml-  - .ds/
rules/datatypes/meg.yaml-  - .json
rules/datatypes/meg.yaml-  - .fif
--
rules/datatypes/meg.yaml-# Specifically, it's dat files with "acq-calibration"
rules/datatypes/meg.yaml-- suffixes:
rules/datatypes/meg.yaml:  - meg
rules/datatypes/meg.yaml-  extensions:
rules/datatypes/meg.yaml-  - .dat
rules/datatypes/meg.yaml-  entities:
rules/datatypes/meg.yaml-    subject: required
rules/datatypes/meg.yaml-    session: optional
--
rules/datatypes/meg.yaml-# fif files with "acq-crosstalk"
rules/datatypes/meg.yaml-- suffixes:
rules/datatypes/meg.yaml:  - meg
rules/datatypes/meg.yaml-  extensions:
rules/datatypes/meg.yaml-  - .fif
rules/datatypes/meg.yaml-  entities:
rules/datatypes/meg.yaml-    subject: required
rules/datatypes/meg.yaml-    session: optional
```

edit: updated the grep command as of v1.7.0-329-gd3429fba:

$> git grep -e '^ *suffixes *:' -A1 -h | grep -v -e 'suffixes' -e -- | sort | uniq -c | sort -n | grep -v '^ *1 '
      2   - electrodes
      3   - channels
      3   - coordsystem
      3   - meg
      4   - mask
      4   - photo
      4   - probseg
      6   - dseg
      6   - events
      7   - physio

tsalo commented 2 years ago

I believe that this came up at last week's schema meeting, and we were going to look at repurposing #1012 to centralize duplicate suffix patterns across datatypes.

effigies commented 1 year ago

@yarikoptic Would you mind having another look? This has changed significantly:

$ git grep -e ' suffixes:' -A1 -h | grep -v -e 'suffixes' -e -- | sort | uniq -c | sort -n | grep -v '^ *1 '
      2     - sessions
      4     - mask
      4     - probseg
      6     - dseg
      6     - meg