bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International

Convergence on time series and events #713

Open effigies opened 3 years ago

effigies commented 3 years ago

It came up in a maintainers meeting today that the problem of non-neural time series is coming up repeatedly, and we should coordinate to ensure there isn't a proliferation of different solutions to similar problems.

BIDS currently has physio.tsv.gz and stim.tsv.gz formats, and the following BEPs all propose some kind of time series (regular sampling) or events (annotations or time-stamped samples):

Perhaps we can schedule a meeting in the next few weeks where we can discuss our use cases and solutions and see if there's some common ground?

cc @bids-standard/bep_leads

sappelhoff commented 3 years ago

cc @adam2392 @hoechenberger @jasmainak @agramfort

ChristophePhillips commented 3 years ago

Sounds like a good idea.

Has anyone already mentioned actigraphic data? or is this part of BEP029, though it seems more focused on actual 3D motion tracking? Poke @ghammad.

guiomar commented 3 years ago

Yes! Worth exploring an agreement, thanks @effigies :)

I think of two possible options, probably there are many more:

1) Create a new entity that can be called "timeseries" so all these options can be organized under it
2) Keep "timeseries" as a suffix and document these options under, for example, modality (mod) or any other entity

nicholst commented 3 years ago

This reference, from an INCF task force, might stimulate discussion... it's a requirements document, and surely is overkill, but might remind us of some issues we'd otherwise forget.

Teeters, J., Benda, J., Davison, A., Eglen, S., Gerkin, R. C., Grethe, J., … Wark, B. (2016). Requirements for storing electrophysiology data. Retrieved from http://arxiv.org/abs/1605.07673

sappelhoff commented 3 years ago

eyetracking BEP --> cc @DejanDraschkow @greckla @bgagl

effigies commented 3 years ago

What would be the best way to set up a meeting? Are certain days or weeks more available to people?

DejanDraschkow commented 3 years ago

I am getting into a more dense teaching stretch now, so might just try to jump on something that you guys agree on! Generally, Monday, Wednesday, and Friday mornings (UK time) are flexible... But @bgagl is the important attendee from our group!

guiomar commented 3 years ago

Maybe next week before or after the BIDS maintainers meeting? (18:45 CET, Tuesday)

greckla commented 3 years ago

Monday / Wednesday morning would work for me, Tuesday before the maintainers meeting is ok too...

melanieganz commented 3 years ago

When is the maintainers meeting on Tuesday? Monday thru Friday anytime in the evening after 8:30 pm CET is fine for me.

CPernet commented 3 years ago

same as Melanie for me

yarikoptic commented 3 years ago

FWIW (in case I don't join -- could not resist to express my invaluable thoughts):

BIDS currently has physio.tsv.gz and stim.tsv.gz formats, and the following BEPs all propose some kind of time series (regular sampling) or events (annotations or time-stamped samples):

so we also have _events.tsv. And 'time series' are inherent to any functional/eeg/etc data. Overall I see .tsv (+ .json) as just a generic container for tabular data, to be used whenever no better common format, accepted/expected by tools, exists for the data type. E.g. func's .nii.gz are time series which could be serialized into .tsv.gz as well, but that would be counter-productive. With that in mind, if time series data for some data domain has some "open" (and more efficient) format already understood by the tools, it might be chosen in favor of a generic container such as .tsv. Having said that, formalizing the .tsv structure for dense (series) and sparse (events) time series should indeed be done. Or maybe even consider adopting some generic binary format with nice properties (indexable, compressed, easy to support in software, etc) to mitigate the shortcomings of text-based .tsv for the many such data types where staring at the data as serialized to ASCII within a tsv is not really that useful ;-)

effigies commented 3 years ago

How about 8:30pm CET/2:30pm EST/11:30am PST on Tuesday, Feb 16?

greckla commented 3 years ago

I would not be able to join on Tue. Hope I am in next time! If you meet, keep me updated!

robertoostenveld commented 3 years ago

I should be able to attend next Tuesday.

Perhaps this is also a good moment to share this "BIDS bigger picture" document with some thoughts that I wrote down some time ago. Some people have already seen it and commented, but it has not been widely distributed yet. Feel free to distribute and use it for discussion. Although it does not concretely touch upon time-series and events, I do think that it is related since it deals with different brain and non-brain signals.

effigies commented 3 years ago

As we don't have an attendee list to send a private video chat link to, I will post a link in this thread half an hour before the call and hope we don't get Zoom-bombed.

Does Jitsi work for everyone?

melanieganz commented 3 years ago

Yes! Looking forward to discussing this tonight.

hoechenberger commented 3 years ago

Does Jitsi work for everyone?

I guess this is alright, although I've had mixed experiences with larger groups (say, >6 or so). But this may just be me :)

robertoostenveld commented 3 years ago

If jitsi turns out not to work well enough tonight, I can share a zoom invite.

effigies commented 3 years ago

Okay, we'll meet at https://meet.jit.si/bids-timeseries in 30 minutes.

effigies commented 3 years ago

Meant to collect use cases earlier, but times are busy. Here's a working doc, and I invite people to add their use cases before the meeting if possible.

effigies commented 3 years ago

For posterity, here's the chat log: https://gist.github.com/effigies/55e31df4e86fa65de7b66738c80bc551

To summarize the situation (see the working doc for more detail):


We have three general types of time series data that are currently supported or will be, as of BEP-009 (PET):

1) Events, defined by onsets and durations
2) Time series, defined by a start time and sampling rate
3) Time-stamped samples, defined by a time column
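
To make the distinction concrete, here is a minimal Python sketch of the three types listed above. The class and field names (`Event`, `RegularSeries`, `TimestampedSamples`, `to_timestamped`) are hypothetical and chosen for illustration only; they are not BIDS terms or an agreed schema. The sketch also shows that a regularly sampled series can always be expanded into time-stamped samples, while the reverse holds only when the sampling is in fact regular.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Type 1: an event, defined by an onset and a duration (seconds)."""
    onset: float
    duration: float


@dataclass
class RegularSeries:
    """Type 2: a time series, defined by a start time and a sampling rate."""
    start_time: float          # seconds relative to time zero
    sampling_frequency: float  # Hz
    values: List[float]

    def times(self) -> List[float]:
        # The time of sample i is implied, not stored.
        return [self.start_time + i / self.sampling_frequency
                for i in range(len(self.values))]


@dataclass
class TimestampedSamples:
    """Type 3: samples with an explicit time column."""
    times: List[float]
    values: List[float]


def to_timestamped(series: RegularSeries) -> TimestampedSamples:
    """Expand a regular series into explicit time-stamped samples."""
    return TimestampedSamples(times=series.times(), values=list(series.values))


# Hypothetical data: recording starts 1 s before time zero, sampled at 2 Hz.
series = RegularSeries(start_time=-1.0, sampling_frequency=2.0,
                       values=[0.1, 0.2, 0.3])
stamped = to_timestamped(series)
print(stamped.times)  # [-1.0, -0.5, 0.0]
```

Under these assumptions, events are the only type that carries a duration, and the regular series is the only type whose time axis is implicit.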

To approach this systematically, there are the following concrete steps to take:

1) Set out a list of terms to describe temporal data, such as regular/irregular sampling, epoched data, etc. This should be an appendix or addition to common principles.
2) Expand the notion of "TimeZero" to modalities other than PET, with defaults consistent with historical usage. Make these defaults explicit for existing time series, such as BOLD.
3) Explicitly describe the existing data types to reduce confusion for tool writers and new BEP authors.

Points to consider further:

1) How to handle epoched/discontinuous time series.
2) Dense or high-frequency time series may justify revisiting #197. Resistance to a bare-bones HDF5 schema seems to be crumbling.
3) Determine whether the association of a time series in a TSV column with a channel can be accommodated with the existing column descriptions or needs additional fields.
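
As one possible illustration of the epoched/discontinuous case, the sketch below splits a column of time stamps into continuous segments wherever the gap between consecutive samples exceeds the expected sampling interval. The function name `split_segments` and the `tolerance` parameter are assumptions made for this example; nothing here was agreed on in the meeting.

```python
from typing import List, Tuple


def split_segments(times: List[float], expected_step: float,
                   tolerance: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, stop) index pairs, stop exclusive, of continuous runs.

    A new segment begins whenever the gap between consecutive time stamps
    is larger than expected_step * (1 + tolerance).
    """
    if not times:
        return []
    max_gap = expected_step * (1 + tolerance)
    segments = []
    start = 0
    for i in range(1, len(times)):
        if times[i] - times[i - 1] > max_gap:
            segments.append((start, i))
            start = i
    segments.append((start, len(times)))
    return segments


# Hypothetical recording at 2 Hz with a pause between 1.0 s and 5.0 s.
times = [0.0, 0.5, 1.0, 5.0, 5.5]
print(split_segments(times, expected_step=0.5))  # [(0, 3), (3, 5)]
```

Each resulting segment could then be described by its own start time and the shared sampling rate, which is one way a discontinuous recording might be reduced to several regular time series.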