spine-generic / data-multi-subject

Multi-subject data for the Spine Generic project
Creative Commons Attribution 4.0 International

Lint .tsv #86

Closed kousu closed 3 years ago

kousu commented 3 years ago

Fixes https://github.com/spine-generic/data-multi-subject/pull/57#issuecomment-702247465

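This can detect three problems: extraneous whitespace around a field, empty fields (null values should be written as '-'), and rows with an incorrect number of columns.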

It doesn't directly detect using spaces instead of tabs, but if you do that you'll probably trip the incorrect count or the trailing whitespace detector.
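For reviewers who want a feel for the logic, here is a minimal Python sketch of the three checks (extraneous whitespace, empty field, incorrect column count). It is not the lint-tsv script from this PR: the function name `lint_tsv` and the return format are made up for illustration, and the messages are only loosely modeled on the real output. It assumes the first line is the header and that its column count is the expected one.

```python
def lint_tsv(lines):
    """Return a list of (line_number, message) tuples; line numbers are 1-based.

    Hypothetical helper for illustration, not the script added in this PR.
    """
    problems = []
    expected = None  # expected column count, taken from the first (header) line
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")  # drop the line ending only, keep other whitespace
        fields = line.split("\t")
        if expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            problems.append((lineno, "incorrect number of columns"))
        for col, field in enumerate(fields, start=1):
            if field == "":
                problems.append((lineno, f"empty field, column {col}"))
            elif field != field.strip():
                problems.append(
                    (lineno, f"extraneous whitespace, column {col}: {field!r}")
                )
    return problems


if __name__ == "__main__":
    sample = ["a\tb\tc\n", "1\t2 \t3\n", "1\t\t3\n", "1\t2\n"]
    for lineno, message in lint_tsv(sample):
        print(f"line {lineno}: {message}")
```

Because fields are split on tabs only, a row typed with spaces instead of tabs collapses into a single field, so it shows up as an incorrect column count rather than as its own error.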

Example usage: After intentionally corrupting participants.tsv in each of the three ways:

$ .github/workflows/lint-tsv participants.tsv 
extraneous whitespace, line 2, column 4: '2019-02-12   '
errors in line 2: 
    'sub-amu01  M   28  2019-02-12      amu AMU - CEMEREM   Siemens Verio   NeckMatrix  syngo_MR_B17    Virginie Callot'

empty field, line 3, column 7. Please use '-' for null values.
errors in line 3: 
    'sub-amu02  M   28  2019-02-13  amu AMU - CEMEREM       Verio   NeckMatrix  syngo_MR_B17    Virginie Callot'

empty field, line 4, column 8. Please use '-' for null values.
errors in line 4: 
    'sub-amu03  F   28  2019-02-13  amu AMU - CEMEREM   Siemens         NeckMatrix  syngo_MR_B17    Virginie Callot'

incorrect number of columns, line 6
errors in line 6: 
    'sub-amu05  F   39  2019-03-01  amu AMU - CEMEREM   Siemens Verio   syngo_MR_B17    Virginie Callot'

It's already added to .github/workflows/validator.yml to catch future glitches introduced by the GitHub web editor or whatever.

This is on top of #87 so please review + merge that first.