bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International
277 stars 162 forks source link

Make `*ChannelCount` metadata OPTIONAL instead of RECOMMENDED #1277

Open sappelhoff opened 2 years ago

sappelhoff commented 2 years ago

For example, the EEGChannelCount metadata should be an integer that corresponds to the count of channels with type EEG in the channels.tsv file.

This metadata is currently not validated with bids-validator (and never has been): https://github.com/bids-standard/bids-validator/issues/673

The basic validation step would be to "count" the channel types in channels.tsv and compare it to the metadata in the *.json files. This would be possible to implement, and we can (should) add this feature in the future.

However, here I am advocating (nudged by @effigies) for making the *ChannelCount metadata OPTIONAL instead of RECOMMENDED throughout the spec. Reasons apart from those outlined in https://github.com/bids-standard/bids-specification/pull/802#pullrequestreview-1088948113 include that it is inconsistent to have the *ChannelCount metadata only for some specific channel types, and not all channel types. This has lead to misunderstandings in the past, but unfortunately I cannot find the issue(s) just now.

Practically the proposed (backward compatible!) change has only an impact on how future validation proceeds:

I think the second option is more appropriate.

rob-luke commented 2 years ago

Thanks @sappelhoff I responded over at https://github.com/bids-standard/bids-specification/pull/802#discussion_r967714688 but I will repeat here for convenience

Thanks @sappelhoff, good suggestion.

I find that the NIRSChannelCount is one of the most important fields. When discussing datasets and individual measurements, the first question I get from collaborators is usually "how many channels does it have?". Given that this information is always available in the underlying data file, and the broad usage of this field I think it should be required.

Counting the individual rows in the channels.tsv file is not human-friendly, particularly when there are thousands of channels. Further, the number of sources and detectors is not easily read by a human (e.g. it is not always half the number of channels, you can have 60 sources, 100 detectors, and 2000 channels).

I think we should keep the core fNIRS ChannelCountfields as REQUIRED, and the auxiliary types, such as GYRO, as OPTIONAL. I understand your concern about mixing requirements, but I think the usefulness of the BEP will be greatly reduced if this information is not stored explicitly in metadata.

robertoostenveld commented 2 years ago

Considering

I think we should keep the core fNIRS ChannelCountfields as REQUIRED

This consideration might extend to each of the core channels types for each of the modalities (alsop MEG, EEG, iEEG). In practice I find it often more difficult to determine the proper channel count for the aux channels, as those are less consistently named.

effigies commented 2 years ago

As @sappelhoff noted, there are no current checks in the validator. Is this simply a documentation question, or would people expect it to be an error if the count does not match what's in channels.tsv?

For NIRS, it could be made an error without running the risk of breaking existing datasets. For other channels, it would need to be a warning, but the same code could be reused for this case.

sappelhoff commented 2 years ago

Is this simply a documentation question, or would people expect it to be an error if the count does not match what's in channels.tsv?

I would consider this an error.

rob-luke commented 2 years ago

I would also consider this an error