bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International

Specify 0 vs 1 based indexing within BIDS #499

Open sappelhoff opened 4 years ago

sappelhoff commented 4 years ago

Problem

The sample column in events.tsv encodes the following:

OPTIONAL. Onset of the event according to the sampling scheme of the recorded modality (i.e., referring to the raw data file that the events.tsv file accompanies).

As identified by @robertoostenveld, we do not specify whether we use 0-based or 1-based indexing.

This should be clarified.

Potential solution

I feel like going with 1-based indexing would make sense if most recommended data formats that also encode samples go with it. For example, most EEG data formats have a representation of "samples".

BrainVision uses 1-based sample indexes in their data files (page 14 in the spec).

How about data formats in iEEG and MEG? ... is this relevant for MRI or other data types?

There is also the argument that many people are not familiar with 0-based indexing.

effigies commented 4 years ago

It could be relevant to MRI, though it's unusual. I would interpret this as a "volume index" instead of event onset in seconds, and so I would be heavily inclined toward 0-indexing because that would be how you index a multidimensional array in pretty much everything but MATLAB.

My top criterion would be consistency. Within the standard, we should either always use 0-indexing or 1-indexing. I don't know that we currently have a situation where we impose one or the other, but if we do, we should stick with it. If we don't, again I would advocate 0-indexing for simplicity's sake.

satra commented 4 years ago

just some fyi: run and echo have indices but no base is specified. most datasets i have seen use 1+. there is mention of slice index 0 in a piece of bids.

dorahermes commented 4 years ago

Most iEEG datasets I have used are also 1-based sample indexes.

effigies commented 4 years ago

Re @satra's comment:

Right, the <index> values in filename entities are just numerical. They aren't even required to be sequential, so I wouldn't consider them to be precedent for either option.

Here's the "slice index zero" text:

A - sign indicates that the contents of SliceTiming are defined in reverse order - that is, the first entry corresponds to the slice with the largest index, and the final entry corresponds to slice index zero.

I think this could be changed to "the slice with the smallest index" without losing any specificity, as we encode a series of values and not their indices, so I also wouldn't consider this precedent.

ddwagner commented 3 years ago

Hi all, chiming in on this issue to say I would love it if the BIDS spec were explicit about the starting position for the onset sample column. Given that this is described as an onset, I think it should start at 0, not 1. If we had an event at time 0 seconds, the resulting onset in the units of a functional MRI would be 0 as well. At least, this is the convention when modelling data using onsets in "scans" in SPM.
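To make the two conventions concrete, here is a minimal Python sketch that converts an onset in seconds into a sample (or volume) index. The function name `onset_to_sample` and its `base` parameter are illustrative assumptions, not anything defined by BIDS; for fMRI, the sampling frequency would be 1 / TR.

```python
def onset_to_sample(onset_s: float, sfreq: float, base: int = 0) -> int:
    """Convert an onset in seconds to a sample index.

    `base` is the index given to the very first sample: 0 or 1.
    This helper is hypothetical and only illustrates the two conventions.
    """
    if base not in (0, 1):
        raise ValueError("base must be 0 or 1")
    if onset_s < 0:
        raise ValueError("onset must not be negative")
    # Number of whole sampling intervals elapsed since the recording start,
    # shifted by the chosen index base.
    return int(round(onset_s * sfreq)) + base


# An event at time 0 s lands on index 0 with 0-based indexing
# and on index 1 with 1-based indexing.
zero_based = onset_to_sample(0.0, sfreq=1000.0, base=0)
one_based = onset_to_sample(0.0, sfreq=1000.0, base=1)
```

With 0-based indexing the index equals the number of elapsed samples, which is the property argued for above; with 1-based indexing every value is shifted by one.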

sappelhoff commented 2 years ago

Triggered by @VisLab's comment in #1043, I am reading this again. There are two problems:

  1. the concrete problem of how to interpret the optional sample column in events.tsv
  2. the general problem that nowhere in the BIDS spec do we specify a consistent rule of which indexing to use (0+ or 1+)

Regarding 1.:

OPTIONAL. Onset of the event according to the sampling scheme of the recorded modality (i.e., referring to the raw data file that the events.tsv file accompanies).

Reading this again, I think our current rule means that the same indexing is to be used as in the raw data file. For example, BrainVision EEG data files use 1+ indexing (section 4.3, second sentence in their data format specification), so having a 0 in the sample column of an events.tsv file that is paired with a BrainVision EEG file does not make any sense.
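A minimal sketch of that reading, assuming the reader already knows the index base of the raw data file: the hypothetical helper `to_array_index` below maps values from the sample column to 0-based positions (as used by Python sequences) and rejects values that cannot occur under the declared base. Neither the helper nor its parameters are part of BIDS.

```python
def to_array_index(samples, base):
    """Map `sample` column values to 0-based array positions.

    `base` is the index base of the accompanying raw data file
    (for example 1 for a 1+ format). Hypothetical helper for illustration.
    """
    if base not in (0, 1):
        raise ValueError("base must be 0 or 1")
    positions = []
    for sample in samples:
        if sample < base:
            # e.g. a 0 in the sample column paired with a 1+ raw file
            raise ValueError(f"sample {sample} is invalid for {base}-based indexing")
        positions.append(sample - base)
    return positions


# Samples 1 and 251 in a 1+ file are the 1st and 251st data points,
# i.e. positions 0 and 250 in a 0-based array.
positions = to_array_index([1, 251], base=1)
```

The check on `sample < base` is what catches a 0 in a file that is paired with 1+ raw data.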

However, this is still problematic when the events.tsv file is relevant to multiple neuro (or physio / beh) data files, if they use different indexing styles (0+ or 1+).

And furthermore it does not address the general problem that we don't declare an indexing style across BIDS. I am not sure whether we could even do that in a backwards compatible way. Perhaps the most we can do would be to introduce a new metadata field at some meaningful level that could advertise 0+ vs 1+ indexing, and RECOMMEND that field.
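As a rough sketch of how a reader could consume such a field: the name `SampleIndexBase` below is purely hypothetical (no such field exists in BIDS), and placing it in a JSON sidecar is likewise an assumption. A missing field is reported as "unknown" rather than defaulted, so existing datasets stay valid.

```python
import json

# Hypothetical field name, NOT part of the BIDS specification.
HYPOTHETICAL_FIELD = "SampleIndexBase"


def read_index_base(sidecar_text: str):
    """Return 0 or 1 if the sidecar advertises an index base, else None."""
    metadata = json.loads(sidecar_text)
    base = metadata.get(HYPOTHETICAL_FIELD)
    if base is None:
        # Field absent: the indexing style is simply unknown.
        return None
    # Exclude booleans explicitly, since True == 1 in Python.
    if isinstance(base, bool) or base not in (0, 1):
        raise ValueError(f"{HYPOTHETICAL_FIELD} must be 0 or 1, got {base!r}")
    return base


declared = read_index_base('{"SampleIndexBase": 1}')
undeclared = read_index_base('{"TaskName": "rest"}')
```

Because the field would only be RECOMMENDED, tools still need a fallback for the `None` case, for example the convention of the raw data format.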

It's all a bit unsatisfactory.