bids-standard / bids-validator

Validator for the Brain Imaging Data Structure
https://bids-standard.github.io/bids-validator/
MIT License

Remove warning for tsv data files without headers (motion/physio...) #1627

Open sjeung opened 1 year ago

sjeung commented 1 year ago

Hi everyone,

I noticed that .tsv data files without headers (for example this one) trigger the warning [Code 82] CUSTOM_COLUMN_WITHOUT_DESCRIPTION (see this PR for context). A few modalities, including motion (currently a BEP) and physio, use these headerless .tsv files as their data format, so it would make sense to remove this warning for files that end with _motion.tsv or _physio.tsv.

Could you have a look into this, @rwblair? According to the PR comment thread linked above, this shouldn't be a blocking issue for progressing with the motion community review.

effigies commented 1 year ago

I guess we should think through how this interacts with the schema. Right now, .tsv files have in-file headers, and .tsv.gz files have in-sidecar headers (sidecar.columns). Looking at the example above, the headers for motion files would seem to come from associations.channels.name.
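
To make the current convention concrete, here is a minimal sketch of where column names come from today. The helper `column_names` and its arguments are hypothetical and not part of bids-validator; the sidecar key is assumed to be `Columns`, as in the BIDS physio/stim sidecars.

```python
def column_names(extension, first_line, sidecar):
    """Return column names under the current convention (hypothetical helper)."""
    if extension == ".tsv":
        # Plain .tsv: the first row of the file is the header.
        return first_line.rstrip("\n").split("\t")
    if extension == ".tsv.gz":
        # Compressed .tsv.gz: no header row; names come from the JSON sidecar.
        return list(sidecar.get("Columns", []))
    raise ValueError(f"not a tabular extension: {extension}")
```

A headerless _motion.tsv falls into the first branch, so its first data row would be read as column names, which is presumably how the custom-column warning gets triggered.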

We could think of writing it this way:

MotionTSV:
  selectors:
    - datatype == 'motion'
    - suffix == 'motion'
    - extension == '.tsv'
  header: associations.channels.name
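
A rule like this could be matched against a file roughly as sketched below. This is only an illustration: `rule_applies` is a hypothetical helper that understands nothing beyond simple `key == 'value'` selectors, whereas the real schema selectors are more general expressions.

```python
def rule_applies(selectors, context):
    """True if every `key == 'value'` selector holds for the file context."""
    for selector in selectors:
        key, _, expected = selector.partition("==")
        # Strip whitespace and the surrounding quotes from the expected value.
        if context.get(key.strip()) != expected.strip().strip("'\""):
            return False
    return True


# The MotionTSV rule from above, written as a plain dictionary.
motion_tsv = {
    "selectors": [
        "datatype == 'motion'",
        "suffix == 'motion'",
        "extension == '.tsv'",
    ],
    "header": "associations.channels.name",
}
```

When all selectors match, the validator would take the column names from the location named in `header` rather than from the first row of the file.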

However, that assumes that this rule would be interpreted before any other validation that might need to associate data with headers or know the number of columns. Another approach could be to add new definitions:

- suffix: physio
  header: sidecar.channels
- suffix: stim
  header: sidecar.channels
- suffix: motion
  header: associations.channels.name
- suffix: '*'
  header: inline
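
The definitions above read as an ordered list where the first matching suffix wins and `'*'` is the fallback. A minimal sketch of that lookup, assuming first-match semantics and glob-style matching (both are assumptions; `header_source` is a hypothetical name, not validator API):

```python
from fnmatch import fnmatchcase

# Mirrors the list above; order matters because '*' matches everything.
HEADER_DEFINITIONS = [
    {"suffix": "physio", "header": "sidecar.channels"},
    {"suffix": "stim", "header": "sidecar.channels"},
    {"suffix": "motion", "header": "associations.channels.name"},
    {"suffix": "*", "header": "inline"},
]


def header_source(suffix, definitions=HEADER_DEFINITIONS):
    """Return the header location for the first definition matching suffix."""
    for definition in definitions:
        if fnmatchcase(suffix, definition["suffix"]):
            return definition["header"]
    return None  # only reachable if the list has no '*' fallback
```

Because the lookup depends only on the suffix, it can run before any check that needs column names or the column count, which is the ordering concern raised above.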