CPernet commented 5 years ago

2 fields

Add to dataset_description.json the field SimultaneousRecording to indicate the different imaging modalities acquired at the same time, e.g. SimultaneousRecording: 'func','EEG','eye tracker', 'physio', 'behav'.

Add a SimultaneousRecordingWith field within the sidecar files to indicate the different imaging modalities acquired at the same time and on different hardware.

e.g. ds000117 SimultaneousRecording: {'MEG','EEG'} but no SimultaneousRecordingWith because meg and eeg data are in the same file and recorded together on the same hardware.

SimultaneousRecordingWith

Tibor's solution is to point to all files related to each other: for instance, in the run-02_bold.json we would have SimultaneousRecordingWith = {'eeg/sub-01_ses-01_task-something_run-02_eeg.vhdr', 'eyetracker/sub-01_ses-01_task-something_run-02_eyetracker.asc'}.

Chris G proposed to add the relative root path (e.g. {'/sub-01/func/sub-01_task-something_run-02_bold.nii.gz'}) to accommodate hyperscanning (simultaneous recordings across participants).
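As a rough sketch of how a tool could consume both variants, the function below turns SimultaneousRecordingWith entries into paths relative to the dataset root. It assumes (this is not settled in the thread) that entries starting with '/' are root-relative, as in Chris G's variant, and that all other entries are relative to the session directory, as in Tibor's example. The function name and the shortened file names in the usage note are made up for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def resolve_simultaneous(sidecar: dict, sidecar_relpath: str) -> List[str]:
    """Return dataset-root-relative paths for each SimultaneousRecordingWith entry.

    Assumption: entries starting with '/' are relative to the dataset root;
    all other entries are relative to the session directory, i.e. the parent
    of the datatype folder (func/, eeg/, ...) that holds the sidecar.
    """
    # e.g. sub-01/ses-01/func/x_bold.json -> sub-01/ses-01
    session_dir = PurePosixPath(sidecar_relpath).parent.parent
    resolved = []
    for entry in sidecar.get("SimultaneousRecordingWith", []):
        if entry.startswith("/"):
            # Root-relative entry: just drop the leading slash.
            resolved.append(entry.lstrip("/"))
        else:
            # Session-relative entry: prepend the session directory.
            resolved.append(str(session_dir / entry))
    return resolved
```

For a sidecar at `sub-01/ses-01/func/a_bold.json`, an entry `eeg/a_eeg.vhdr` would resolve to `sub-01/ses-01/eeg/a_eeg.vhdr`, while `/sub-02/func/b_bold.nii.gz` would point at another participant's file.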

Robert pointed out that 'It might also be relevant to know whether behaviour (behav) was recorded during the functional brain recordings (e.g. func+behav), or prior to (or after) the functional scan. It is often assumed (e.g. in the main bids spec) that these are recorded simultaneously, but e.g. section 8.7 and 8.8 already have to deal with the simultaneous versus sequential (or separate) measurement of the two. When considering SimultaneousRecording as a general field, it might have the side effect of "behav" becoming a mature data type on its own, rather than an addition to functional brain data. This would make it more symmetric'.

Mainak has an open issue: should we split concurrent recordings, e.g. split the MEG and EEG data of ds000117? Most felt it is neither necessary nor recommended, since the data come from the same hardware. One issue to solve is then to ensure that the metadata cover all modalities properly, e.g. both MEG and EEG without conflict (seems largely possible to me).

synchronization issue

Since we have multiple modalities, each one should have its own timing information recorded using events.tsv files. This is important because sampling frequencies differ between modalities and clocks on different hardware often run at different speeds. If there is no event (rest), we need at least one marker to ensure the recordings are synchronized (a typical case in point is starting the EEG recording before the MRI starts).

Chris G pointed out that to compare events.tsv they should probably have the same number of rows since there is no unique identifier for events.
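To make the row-count point concrete, here is a minimal sketch that pairs the events of two modalities by row position and refuses to do so when the counts differ. It only relies on the standard `onset` column of events.tsv; the function names and the example values in the usage note are invented.

```python
import csv
import io


def read_onsets(tsv_text: str) -> list:
    """Return the 'onset' column of an events.tsv given as text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [float(row["onset"]) for row in reader]


def pair_events(tsv_a: str, tsv_b: str) -> list:
    """Pair events from two events.tsv files by row position.

    Because events carry no unique identifier, the only way to match them
    is by order, which is only safe when both files have the same number
    of rows.
    """
    onsets_a = read_onsets(tsv_a)
    onsets_b = read_onsets(tsv_b)
    if len(onsets_a) != len(onsets_b):
        raise ValueError(
            f"cannot pair events: {len(onsets_a)} rows vs {len(onsets_b)} rows"
        )
    return list(zip(onsets_a, onsets_b))
```

With a BOLD file holding onsets 0.0 and 10.0 and an EEG file holding onsets 5.2 and 15.2, `pair_events` returns the two pairs in order; if one marker were missing on either side, it raises instead of silently mis-aligning the rest.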

Future formats to be adopted, like XDF, contain multimodal data (say video screen capture, eye tracker, physio and EEG) with a master clock and specific clocks. Again, splitting files seems redundant and unnecessary, but under which folder will such a file appear? My idea is that it is up to the experimenter to decide, and to put it under the primary measure of the study.

CPernet commented 5 years ago

@chrisfilo @jasmainak @robertoostenveld @dorahermes

dorahermes commented 5 years ago

I like the idea of having SimultaneousRecording and SimultaneousRecordingWith

+1 on not splitting files

jasmainak commented 5 years ago

Thanks @CPernet for the detailed description. Isn't the SimultaneousRecording field redundant with what you can already find in channels.tsv ? It will tell you if the data contains both MEG and EEG channels for example ...

CPernet commented 5 years ago

For the rare case of MEG-EEG: if you acquired on two different systems (i.e. you have a separate EEG amplifier), you would still have simultaneous recordings but two separate channels.tsv files, so there is no redundancy. If on the same system, like ds000117, sure, to some degree it is redundant, although IMO the global field SimultaneousRecording has a different objective (different from SimultaneousRecordingWith).

For the more generic cases (fMRI-EEG, EEG-eye tracker, PET-fMRI, etc.), the global field SimultaneousRecording allows easy indexing for future searches in BIDS data.

jasmainak commented 5 years ago

okay fair enough!

robertoostenveld commented 5 years ago

In the case of MEG + EEG acquired on two different systems, the two sampling rates would be different and the data would be split over two types (e.g. "sub-01/ses-combined/eeg" and "sub-01/ses-combined/meg"). That is identical to BOLD + EEG, which will always be recorded with different systems, or BOLD + Physio which may or may not be recorded with different systems.

I do not see the need for the SimultaneousRecording field. This would replicate the information in the XXXChannelCount fields (with XXX being the different types of data, e.g. EOG, ECG, EMG, AUDIO, EYEGAZE, PUPIL, EEG, MEG) in the meg.json, eeg.json and ieeg.json.

CPernet commented 5 years ago

It would not necessarily replicate fields: think simultaneous PET-MRI, same system but everything is different.

Of course, within the JSON files SimultaneousRecordingWith will point to each related file. To me, the global field has indexing value: it quickly distinguishes multimodal datasets acquired sequentially from those acquired simultaneously.

yarikoptic commented 4 years ago

Additional 1c: there is also the acq_time field in the _scans files (https://bids-specification.readthedocs.io/en/stable/03-modality-agnostic-files.html#scans-file), which provides critical information about possible multiple acquisitions (overlapping in time). The only tricky part is the need to compute the duration of each. If we add an end_time or duration field to the scans file, we would gain the ultimate way to figure out what was simultaneous with what, what the possible temporal offsets between recordings were, etc.
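A minimal sketch of what that would enable: given the existing acq_time value of two scans plus a duration in seconds for each (the proposed, not yet existing, column), the overlap is a few lines of datetime arithmetic. The function name and the timestamps in the usage note are invented.

```python
from datetime import datetime, timedelta


def overlap_seconds(acq_a: str, dur_a: float, acq_b: str, dur_b: float) -> float:
    """Return how many seconds two recordings overlap (0.0 if they do not).

    acq_* are acq_time strings such as '2020-01-01T10:00:00';
    dur_* are durations in seconds (hypothetical scans.tsv column).
    """
    start_a = datetime.fromisoformat(acq_a)
    start_b = datetime.fromisoformat(acq_b)
    end_a = start_a + timedelta(seconds=dur_a)
    end_b = start_b + timedelta(seconds=dur_b)
    # The shared interval runs from the later start to the earlier end.
    shared = (min(end_a, end_b) - max(start_a, start_b)).total_seconds()
    return max(0.0, shared)
```

Two 10-minute recordings started at 10:00:00 and 10:05:00 on the same day would overlap by 300 seconds; the difference between the two start times is the temporal offset mentioned above.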

robertoostenveld commented 4 years ago

I doubt whether this should go at the level of dataset_description.json. It is something that is specific for scans (i.e. recordings) within a session. For example, in one subject (or one session) you may have recorded multiple streams simultaneously, and in another subject (or session) you may not have recorded them simultaneously.

In this example everything is coded in the scans.tsv file (which is specific for one session). It is clear that, due to the absence of duration information in that file, it is non-trivial to determine which ones are overlapping. But most of the "scans" have the RecordingDuration in their json. Only MRI (both structural and functional) lacks a metadata field that describes how long the whole recording took. For functional data you now have to load the nifti to determine the number of volumes, and multiply that by the RepetitionTime from the json.
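For the functional case, the computation described above can be sketched without loading the image data, by reading only the header. This assumes an uncompressed or gzipped NIfTI-1 file (NIfTI-2 is not handled) and a sidecar dict that contains RepetitionTime in seconds; the function names are made up.

```python
import gzip
import struct


def n_volumes(nifti_path: str) -> int:
    """Read the number of volumes (dim[4]) from a NIfTI-1 header."""
    opener = gzip.open if nifti_path.endswith(".gz") else open
    with opener(nifti_path, "rb") as f:
        header = f.read(348)  # a NIfTI-1 header is 348 bytes long
    # sizeof_hdr (first int32) must equal 348; use it to detect byte order.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    # dim[8] is an array of int16 starting at byte offset 40.
    dim = struct.unpack(endian + "8h", header[40:56])
    # dim[0] is the number of dimensions; a 3D image counts as one volume.
    return dim[4] if dim[0] >= 4 else 1


def bold_duration(nifti_path: str, sidecar: dict) -> float:
    """Duration in seconds: number of volumes times RepetitionTime."""
    return n_volumes(nifti_path) * float(sidecar["RepetitionTime"])
```

A run with 200 volumes and a RepetitionTime of 2.0 s would come out at 400 s. Sparse or otherwise non-continuous acquisitions would need more than this simple product.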

In general I am not happy about replicating information. Using pybids or matlab-bids it should be possible to form a query that extracts the right information easily. Duplication means that the querying code gets more complex (because it has to check multiple locations) and that (meta)data can be inconsistent.

CPernet commented 4 years ago

@robertoostenveld in my mind, the point of the dataset_description field is to help distinguish simultaneous recordings (any except behaviour) so that it's easy to catalogue datasets. Yes, you can figure it out by checking scans.tsv, but doing that to update your repository catalogue is a bit of a pain IMO. Let's vote and close that issue?

robertoostenveld commented 4 years ago

What would it look like concretely? Could you demonstrate the proposed dataset_description.json for the two-subject example here, which has a mix of simultaneous and non-simultaneous modalities?

CPernet commented 4 years ago

could not see a dataset_description.json file? so if we take https://github.com/bids-standard/bids-examples/blob/master/eeg_rest_fmri/dataset_description.json

we go from

{
    "Name": "EEG, fMRI and NODDI at rest'",
    "BIDSVersion": "v1.1.X",
    "License": " Creative Commons Attribution 4.0 International License",
    "Authors": [
        "F. Deligianni",
        "M. Centeno",
        "D.W. Carmichael",
        "G.H. Zhang",
        "C.A. Clark",
        " J.D. Clayden "
    ],
    "Acknowledgements": "Thanks to C.R. Pernet for preparing the dataset following BIDS",
    "ReferencesAndLinks": [
        "F. Deligianni, M. Centeno, D.W. Carmichael and J.D. Clayden (2014). Relating resting-state fMRI and EEG whole-brain connectomes across frequency bands. Frontiers in Neuroscience 8:258",
        "F. Deligianni, D.W. Carmichael, Gary H. Zhang, C.A. Clark and J.D. Clayden (2016). NODDI and tensor-based microstructural indices as predictors of functional connectivity. PLoS ONE 11(4):e0153404"
    ],
    "SourceDatasetsURLs": " https://osf.io/94c5t/"
}

to

{
    "Name": "EEG, fMRI and NODDI at rest'",
    "SimultaneousRecording": "EEG, BOLD, DWI",
    "BIDSVersion": "v1.1.X",
    "License": " Creative Commons Attribution 4.0 International License",
    "Authors": [
        "F. Deligianni",
        "M. Centeno",
        "D.W. Carmichael",
        "G.H. Zhang",
        "C.A. Clark",
        " J.D. Clayden "
    ],
    "Acknowledgements": "Thanks to C.R. Pernet for preparing the dataset following BIDS",
    "ReferencesAndLinks": [
        "F. Deligianni, M. Centeno, D.W. Carmichael and J.D. Clayden (2014). Relating resting-state fMRI and EEG whole-brain connectomes across frequency bands. Frontiers in Neuroscience 8:258",
        "F. Deligianni, D.W. Carmichael, Gary H. Zhang, C.A. Clark and J.D. Clayden (2016). NODDI and tensor-based microstructural indices as predictors of functional connectivity. PLoS ONE 11(4):e0153404"
    ],
    "SourceDatasetsURLs": " https://osf.io/94c5t/"
}

with SimultaneousRecording using the names we use to describe each modality

dorahermes commented 4 years ago

I assumed that SimultaneousRecording is specific within a session, not at the project level. A more free form description could potentially exist in the dataset_description, but in the example above it could even mean that DWI and BOLD are acquired simultaneously.

Would it be an option to add optional columns to the _scans.tsv to cross-reference simultaneous scans (which would require simultaneous recordings to add a _scans.json file)?

CPernet commented 4 years ago

oh you are so right @dorahermes, my bad

{
    "Name": "EEG, fMRI and NODDI at rest'",
    "SimultaneousRecording": "EEG, BOLD",
    "BIDSVersion": "v1.1.X",
    "License": " Creative Commons Attribution 4.0 International License",
    "Authors": [
        "F. Deligianni",
        "M. Centeno",
        "D.W. Carmichael",
        "G.H. Zhang",
        "C.A. Clark",
        " J.D. Clayden "
    ],
    "Acknowledgements": "Thanks to C.R. Pernet for preparing the dataset following BIDS",
    "ReferencesAndLinks": [
        "F. Deligianni, M. Centeno, D.W. Carmichael and J.D. Clayden (2014). Relating resting-state fMRI and EEG whole-brain connectomes across frequency bands. Frontiers in Neuroscience 8:258",
        "F. Deligianni, D.W. Carmichael, Gary H. Zhang, C.A. Clark and J.D. Clayden (2016). NODDI and tensor-based microstructural indices as predictors of functional connectivity. PLoS ONE 11(4):e0153404"
    ],
    "SourceDatasetsURLs": " https://osf.io/94c5t/"
}

robertoostenveld commented 4 years ago

But with

"SimultaneousRecording": "EEG, BOLD",

you are simplifying it too much. In the "POM" dataset (the one on the FT website but that is not shared) there is EMG recorded during the anatomical MRI and during the DWI (to control for quality, i.e. patients with severe tremor) and also during the functional MRI. Furthermore, there is eye tracker recorded during the functional MRI.

To be honest, I did not look with sufficient detail in the "POM" dataset to determine precisely which stream was recorded simultaneously with which, but there were 4 devices involved: presentation pc, eyetracker, brain amp and MR scanner. That results in general in such a pattern, where time is along the horizontal axis.

There is no way you can get that properly represented as a list with ['mri', 'eye', 'meg', 'beh'].

I furthermore agree with @dorahermes that correctly representing this is to be done within the session.

PS I pointed to the "POM" example because I know it is a complex one. But in case it is a simple one, with two devices and one recording for each (e.g. one EEG and one video), we don't really need to solve it: a sentence in the README is enough, and the acq_time can be used without problem. More precise documentation becomes relevant in more complex cases (e.g. three EEGs with two video files, or more than two modalities).

CPernet commented 4 years ago

yes, let's see with POM - we could use a list

[["T1w", "EMG"], ["DWI", "EMG"], ["EyeTracker", "BOLD"]]

I'm only suggesting something that would make it easy to search through datasets without having to look through the scans.tsv files
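To illustrate the kind of search this is meant to enable, here is a minimal sketch of a catalogue query. It assumes the field is stored in dataset_description.json as a list of modality groups (one possible JSON encoding of the list above, not an agreed format), and that the descriptions have already been loaded into a dict keyed by dataset name. The function and dataset names are hypothetical.

```python
def datasets_with_pair(catalogue: dict, modality_a: str, modality_b: str) -> list:
    """Return names of datasets where both modalities share a simultaneous group.

    catalogue maps a dataset name to its parsed dataset_description.json.
    Assumption: 'SimultaneousRecording' is a list of groups, each group being
    a list of modality names recorded at the same time.
    """
    hits = []
    for name, description in catalogue.items():
        groups = description.get("SimultaneousRecording", [])
        if any(modality_a in group and modality_b in group for group in groups):
            hits.append(name)
    return hits
```

With a "pom" entry holding the three groups above and a "rest" entry holding a single EEG/BOLD group, a query for EMG and DWI would return only "pom". Datasets without the field are simply skipped.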

yarikoptic commented 1 year ago

A few pointers from our chat with @robertoostenveld who reminded about this old issue which remains open and often referenced for a reason

discussion in the Google group: https://groups.google.com/u/0/g/bids-discussion/c/NeqFO-lJhsc/m/Ujg_HZUpBwAJ?pli=1 centered around the XDF file specification (https://github.com/sccn/xdf/wiki/Specifications) and LSL (https://github.com/labstreaminglayer/App-LabRecorder) streams within XDF, to allow syncing multiple clocks in a dataset
https://github.com/moeinrazavi/OpenSync project I found at OpenBehavior website -- worth reviewing. Recent publication describing it: https://www.sciencedirect.com/science/article/abs/pii/S0165027021003939?via%3Dihub
@robertoostenveld showed a convention he used in one of his studies, with _scans.tsv as the location to provide "sync"ed times. In that regard I also submitted a minor PR, https://github.com/bids-standard/bids-specification/pull/1368, to clarify the wording by 1 word
PET has unique metadata TimeZero to provide a reference time for the rest of times in the session
I think (and I believe @robertoostenveld agreed) that we need to be able to reflect possible Offset + Slope in time conversions (so assume them to be linear), and in general not bother with unstable clocks etc.
I think we cannot escape entirely from introducing a notion of "clocks", and establishing (linear) relationships between them to be able to provide better precision in aligning multimodal data (well -- nearly every BIDS dataset is "multimodal" as in including _events and stimuli/ which isn't typically shared, in my word).
I mentioned the existence of a standard to represent "durations", https://en.wikipedia.org/wiki/ISO_8601#Durations, in a more human-readable form than big numbers of seconds.
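The Offset + Slope idea from the list above can be sketched as an ordinary least-squares line fitted through the times of markers seen by both clocks. This is only an illustration under the linearity assumption stated there; the function names and the marker times in the usage note are invented.

```python
def fit_clock(t_src: list, t_ref: list) -> tuple:
    """Fit t_ref ~ offset + slope * t_src by ordinary least squares.

    t_src and t_ref are the times (in seconds) of the same markers as seen
    by the source clock and by the reference clock. Returns (offset, slope).
    """
    n = len(t_src)
    if n != len(t_ref) or n < 2:
        raise ValueError("need at least two matched markers")
    mean_s = sum(t_src) / n
    mean_r = sum(t_ref) / n
    var_s = sum((s - mean_s) ** 2 for s in t_src)
    if var_s == 0:
        raise ValueError("source marker times must not all be identical")
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(t_src, t_ref))
    slope = cov / var_s
    offset = mean_r - slope * mean_s
    return offset, slope


def to_reference(t: float, offset: float, slope: float) -> float:
    """Convert a source-clock time to the reference clock."""
    return offset + slope * t
```

For markers at 0, 10 and 20 s on one clock and 5.0, 15.001 and 25.002 s on the other, the fit gives an offset of about 5 s and a slope of about 1.0001, i.e. a drift of roughly 0.1 ms per second. A single shared marker only fixes the offset, which is why at least two are required here.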

CPernet commented 1 year ago

Re-reading that issue, and with 'a few' datasets under my belt, I don't see the need for those fields, because we can figure out everything with scans.tsv
offset is exactly what TimeZero does for PET (move it in yaml to a generic info?)
yes, the clock business is needed at some point, as high temporal sampling modalities do get out of sync; maybe open an issue for that special topic

bids-standard / bids-specification

SimultaneousRecording and SimultaneousRecordingWith #86
