bids-standard / bids-validator

Validator for the Brain Imaging Data Structure
https://bids-standard.github.io/bids-validator/
MIT License
1 stars 4 forks source link

overhaul units validation in bids-validator #71

Open sappelhoff opened 4 years ago

sappelhoff commented 4 years ago

we are currently in the final review round of a PR in bids-specification that will RECOMMEND CMIXF formatting of SI units: https://github.com/bids-standard/bids-specification/pull/411

Once that PR is merged, we should "switch off" units validation as it is currently and then start re-implementing / changing the way we validate to be in line with the specification again.

Specifically, units should be checked whether they conform to CMIXF-12 (+ the 5 unicodes we include for backward compatibility) ...

outdated, because the PR changed

~We also need to consider cases where SI units are impossible -> in these cases, users can specify their units in JSON sidecars. (This needs to be covered by the tests, i.e., IF invalid SI unit found, but unit specified in JSON, do not raise ... else send error "please format as CMIXF")~

~Lastly, the validator should alert users when the INVALID unicode U+03BC (μ) is used instead of the VALID U+00B5 (µ). The warning should include something like "please take extra care ... these characters look the same, but are different"~

sappelhoff commented 4 years ago

@rwblair will your new package https://github.com/poldracklab/cmixf-js be easily integrated into bids-validator?

rwblair commented 4 years ago

Should be. It's published it on npm. Don't have time at the moment but I'll make a PR for the validator to try and use it next week.

sappelhoff commented 4 years ago

bids-standard/bids-specification#411 is now merged, we should "deactivate" current unit validation ASAP @rwblair

DimitriPapadopoulos commented 3 years ago

Uncommenting the SI units validation code would fix this LGTM.com recommendation: https://lgtm.com/projects/g/bids-standard/bids-validator/snapshot/18d01f82b84c4f67c48922e13928940176011ff4/files/bids-validator/validators/tsv/tsv.js?sort=name&dir=ASC&mode=heatmap#x247409e34f5d77de:1 https://github.com/bids-standard/bids-validator/blob/abb700c79b6d79683598a394aea5e0573b33412b/bids-validator/validators/tsv/tsv.js#L309-L336

DimitriPapadopoulos commented 3 years ago

I have created a PR to uncomment the SI units validation code.

It's been sitting unused for a long time. Does it need some overhauling or more testing?

sappelhoff commented 3 years ago

yeah it needs both overhauling and testing.

We would want the unit checks to:

  1. see if a unit anywhere in a BIDS metadata file is in CMIXF
  2. if it is, all fine
  3. else: warn, that CMIXF formatting is RECOMMENDED
  4. possibly even provide a suggestion for CMIXF, e.g., a user supplies µV, the validator then recommends to reformat as uV
DimitriPapadopoulos commented 3 years ago

One problem is µ instead of u for micro (actually both µ and u are valid for backwards compatibility): https://github.com/bids-standard/bids-validator/blob/abb700c79b6d79683598a394aea5e0573b33412b/bids-validator/utils/unit.js#L92

DimitriPapadopoulos commented 3 years ago

Another one is instead of Ohm (again both and Ohm are valid for backwards compatibility): https://github.com/bids-standard/bids-validator/blob/abb700c79b6d79683598a394aea5e0573b33412b/bids-validator/utils/unit.js#L37

rwblair commented 3 years ago

So we recommend both SI and the stricter CMIXF. But neither are required. But what is required is to have a units entry in a sidecar if the units are not SI.

@sappelhoff is my understanding of the specs unit guidelines correct? Is there an explicit example of documenting non SI units in a sidecar? https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/09-positron-emission-tomography.html#example-blood-data

Above example shows units in a sidecar as required by PET part of spec. Could imagine documenting non SI units used in TSV files would be done similarly.

bids-standard/bids-specification#885 goes a ways towards indicating fields that should be validated as units and should be used once its merged. Unfortunately the regex for validating both SI/CMIXF is so complicated that ever putting it in the schema may be out of the question.

I haven't found a standard js library for validating SI units so we should keep using the code we have for that. I updated the old cmixf-js package so thats still an option doesn't have any dependencies and fortunately I don't think CMIXF-12 is going to be changing. I might move that over to bids-standard instead of keeping it under poldracklab account.

sappelhoff commented 3 years ago

is my understanding of the specs unit guidelines correct?

yes :+1:

Is there an explicit example of documenting non SI units in a sidecar?

I haven't seen any

Could imagine documenting non SI units used in TSV files would be done similarly.

I agree

I updated the old cmixf-js package so thats still an option doesn't have any dependencies and fortunately I don't think CMIXF-12 is going to be changing. I might move that over to bids-standard instead of keeping it under poldracklab account.

Nice