Examples on how to read and process BIDS data with fieldtrip for the website

JojoVh commented 2 years ago

Dear Fieldtrip developers,

I really appreciate that FieldTrip supports BIDS. I would like to automatically read, process, and analyze a dataset in BIDS format, but I find few examples of the features the FieldTrip toolbox offers to support that.

For instance, the "Getting Started with BIDS" page only gives examples of how to write BIDS with the data2bids function: https://www.fieldtriptoolbox.org/getting_started/bids/

My question is: could you add information to the website on how to read a BIDS dataset itself, rather than individual file types?

(You can find an example of what I mean here:) https://mne.tools/mne-bids/stable/auto_examples/read_bids_datasets.html

Thank you very much,
Jonathan
@ICNeuromodulate Charité Berlin

schoffelen commented 2 years ago

Feel free to extend the section: https://www.fieldtriptoolbox.org/getting_started/bids/#reading-data-from-a-bids-dataset

JojoVh commented 2 years ago

> Feel free to extend the section: https://www.fieldtriptoolbox.org/getting_started/bids/#reading-data-from-a-bids-dataset

Thank you for your quick reply. I am happy to help extend that section, but where can I find which features FieldTrip has for reading BIDS data? Maybe I need to clarify: is there anything like a "BIDS2data", "import BIDSpath", "read_raw_BIDS", "list_channel_tsv", or "select_BIDS_files" function that you could suggest?

The section you pointed me to refers to data types, but to my understanding these are independent of whether or not the data were written in BIDS format. For example, reading a BrainVision file is well explained, but how do I get the BIDS path to the BrainVision file, or plot the BrainVision data for all files of subjects with a certain session "x"?

robertoostenveld commented 2 years ago

Hi Jonathan,

The documentation on reading and processing data that is organized in BIDS is indeed still too sparse, which is also due to it still being in development. We are (from time to time) making improvements to low level reading functions, but those do not really change the overall structure to adopt for implementing your pipelines with a complete analysis script that reads (and writes) data from/to a BIDS organized dataset. We would ideally show all aspects of data handling.

Please have a look at https://doi.org/10.1016/j.dcn.2021.101036 and the accompanying code on https://github.com/Donders-Institute/infant-cluster-effectsize that I wrote with @DidiLamers and @marlenemeyer. There we basically have three "phases":

1. convert the source data to BIDS raw
2. do the analysis
3. convert the results to BIDS derivatives

For the 1st we have quite some documentation on the FT website, using data2bids. For the 3rd I am still not so sure what we should precisely aim for, also due to lack of concrete progress on BEP021. I guess that (but correct me if I am wrong) your interest is mainly on the 2nd, i.e., if you have data in BIDS, what is the best strategy to organize your analysis.

We already discussed this once in a conference symposium where I presented https://github.com/robertoostenveld/Wakeman-and-Henson-2015. That was just prior to BIDS appearing on the scene, but the symposium resulted in a strong drive to adopt BIDS for MEG/EEG and resulted in the group analysis special issue. My Wakeman and Henson scripts never made it into the special issue, but the strategy is still one that I like (and that you can recognize in the developmental EEG paper).

Further developing all of this (the standard, the tooling/code, the documentation, the procedures, etc.) takes time, and would benefit from input from people like you. What I can do here (i.e., in this issue) is to provide some further pointers. That might result in improved documentation. Where that might go, I am not sure yet.

robertoostenveld commented 2 years ago

Let me first give pointers to available documentation. There is the Getting started with BIDS, but there is also the Creating a clean analysis script introductory tutorial. There are also the madrid2019 and paris2019 workshop documents where BIDS plays a role.

robertoostenveld commented 2 years ago

Since the analysis script is in the hands of the user, FieldTrip has little control over how you loop over subjects, conditions, etc. But (nearly) all individual files in BIDS raw can be read directly in FieldTrip; as such, processing BIDS-converted data is no different from processing that data in some idiosyncratic organization.

What is relevant to know, however, is that ft_read_header will read the channels.tsv and use that to (potentially) overrule the information in the (binary) header. That way you can fix channel names, types, etc. This of course extends to ft_preprocessing. The second important aspect is the events.tsv, which can be used with ft_read_event.m and thereby with ft_definetrial.m. However, I don't know the details off the top of my head. There is also ft_trialfun_bids.m, which plays a role here.
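As a minimal sketch of the pointers above, the following reads the header and events of one BIDS recording and defines trials with ft_trialfun_bids. The dataset path is hypothetical, and no cfg.trialdef options are set here; which options are needed is an assumption to check against the help of ft_trialfun_bids. The FieldTrip calls are guarded so that nothing is executed when the file is absent.

```matlab
% Hypothetical BIDS recording; replace with a file from your own dataset.
dataset = fullfile('/data/bids_dataset', 'sub-01', 'eeg', 'sub-01_task-rest_eeg.vhdr');

cfg          = [];
cfg.dataset  = dataset;
cfg.trialfun = 'ft_trialfun_bids';  % trial definition based on the events.tsv sidecar

if exist(dataset, 'file')
  % According to the comment above, the channels.tsv sidecar can overrule
  % the channel information from the binary header.
  hdr = ft_read_header(dataset);
  disp(hdr.label);

  % Events; according to the comment above the events.tsv sidecar is used here.
  event = ft_read_event(dataset);

  % Segment the data; see "help ft_trialfun_bids" for the selection options.
  cfg  = ft_definetrial(cfg);
  data = ft_preprocessing(cfg);
end
```

Apart from the way the filename is obtained, this is the same sequence of calls as for data that is not organized according to BIDS.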

At a slightly higher level you can also use ft_read_tsv.m and the corresponding write function. This allows you to read the participants.tsv and implement your loops. There are also ft_read_json.m and the corresponding write function.

I hope these pointers give you some direction...

robertoostenveld commented 2 years ago

Oh, and I should add that ft_trialfun_bids.m makes use of some (also undocumented) functionality: you can use a table instead of the 3-column trl matrix. That allows mixing different variable types (not only numbers) and better specifying what the additional columns mean. Compared to the strict and limited definition of the event structure, this allows ft_definetrial to be conceptually more similar to how you would look at the events.tsv.
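To illustrate the difference, here is a small sketch in plain MATLAB that contrasts a numeric trl matrix with a table. The sample numbers and the extra columns are made up, and the column names begsample, endsample and offset are an assumption about what FieldTrip expects for the first three columns; the functionality is described above as undocumented.

```matlab
% Classic numeric representation: begin sample, end sample, offset
trlmatrix = [
  1001 2000 -200
  3001 4000 -200
  5001 6000 -200
  ];

% The same trials as a table, with additional columns of mixed type
begsample     = trlmatrix(:,1);
endsample     = trlmatrix(:,2);
offset        = trlmatrix(:,3);
condition     = {'standard'; 'deviant'; 'standard'};  % text, not possible in a numeric matrix
response_time = [0.412; 0.530; NaN];                   % in seconds, NaN for no response

trl = table(begsample, endsample, offset, condition, response_time);

% Selecting trials now reads much like a query on the events.tsv
deviant = trl(strcmp(trl.condition, 'deviant'), :);
disp(deviant);
```

The named columns document themselves, whereas in a numeric matrix the meaning of the fourth and later columns has to be remembered by the analyst.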

robertoostenveld commented 2 years ago

... we could even consider organizing an online hackathon (via zoom) to work on improved documentation for procedures and best practices together. For this it might be good to gather some more people to work on this (e.g. through the mailing list), as that would increase the diversity and result in a broader view on these challenges.

schoffelen commented 2 years ago

Some existing documentation can be harvested from https://www.fieldtriptoolbox.org/workshop/paris2019/handson_raw2erp/ and the other code and tutorials that were created for that workshop -> I see that Robert actually already pointed this out in one of his comments.

JojoVh commented 2 years ago

Dear Robert,

Thank you so much for your detailed reply and extensive resources. I very much appreciate the huge amount of work and effort you and the Fieldtrip team have done in implementing BIDS. I respect very much that there are more things in the pipeline and still under development.

I went over the resources, and I see that there are multiple places in which group analysis, high level processing and automation has been set up and explained.

From your point of view, I see there is

1. convert the source data to BIDS raw
2. do the analysis
3. convert the results to BIDS derivatives

I agree very much, that nr1 is very well documented on Fieldtrip, and I use it in our lab as first point. nr2 is also documented on the data level, and files can be read indeed with e.g. ft_read_json and ft_read_tsv. It works for me. nr3 is something I do myself, and it is beyond what I expect from Fieldtrip.

However, I think a critical part is missing for naive users like me: between 1 and 2, once you have your dataset in BIDS (which is the case) and you know how to do the analysis, there is a simple challenge: selecting the files and getting the paths right, that is, reading the BIDS structure itself. The power of the BIDS structure is that the location of every file can be derived from it.

Here is an example of how BIDS pathing is documented for SPM. Since FieldTrip and SPM are mutually dependent, so to say, it may be of interest:

https://en.wikibooks.org/wiki/SPM/BIDS

% Parse BIDS directory
BIDS = spm_BIDS('/data/BIDS-examples/ds007');

% Make general queries about the dataset
spm_BIDS(BIDS,'subjects')

% Get the NIfTI file for subject '05', run '02' and task 'stopsignalwithmanualresponse':
spm_BIDS(BIDS,'data','sub','05','run','02','task','stopsignalwithmanualresponse','type','bold')

ans =

  '/data/ds007/sub-05/func/sub-05_task-stopsignalwithmanualresponse_run-02_bold.nii.gz'

or similarly with the BIDSPath class of mne_bids:

bids_path = BIDSPath(subject='01', session='01', run='05', datatype='meg', root='./bids_dataset')

If I may contrast this to the resource of Madrid (https://www.fieldtriptoolbox.org/workshop/madrid2019/tutorial_cleaning/)

cfg = []; cfg.dataset = ['/madrid2019/tutorial_cleaning/single_subject_resting/' subj '_task-rest_run-3_eeg.vhdr'];

which is in my opinion not making use of the BIDS structure for reading in your file.

So, from my point of view, an example of how to elegantly select your file of interest when reading a BIDS dataset would be an extremely valuable addition that perhaps does not require much development. Perhaps spm_BIDS could be added with an example on https://www.fieldtriptoolbox.org/getting_started/bids/#reading-data-from-a-bids-dataset, or ft_read_tsv could be expanded so that, for example, the electrodes.tsv can be read by specifying just the subject and session name, e.g. ft_read_tsv(subj, ses, 'electrodes') or ft_read_tsv(subj, ses, task, run, 'channels') or ft_read_tsv(BIDSpath)

Thanks again for your interest in the feedback I have on this topic.
Best wishes,
Jonathan

robertoostenveld commented 2 years ago

Thanks, that helps to zoom in. Have you looked at bids-matlab? That was originally forked from spm_BIDS and allows "querying" a BIDS formatted dataset. Since BIDS data handling in MATLAB is a general challenge, I think that a general solution might be better than a toolbox-specific one.

Having said that, FieldTrip also includes some private functions in https://github.com/fieldtrip/fieldtrip/blob/master/fileio/private/ to find BIDS files. But the concept of "querying" for a file (or files) is more powerful, compared to merely constructing the filename from the known parts.
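As a rough illustration of what "querying" could look like without any toolbox, the sketch below searches a dataset for all resting-state EEG recordings using plain MATLAB. It is not how the FieldTrip private functions or bids-matlab work. The dataset is a throwaway demo created in a temporary folder, the subject and task names are made up, and the recursive '**' wildcard and the folder field of dir assume MATLAB R2016b or later.

```matlab
% Demo only: create a tiny fake BIDS tree with empty files in a temporary folder
bidsroot = tempname;
demo = {
  'sub-01', 'sub-01_task-rest_eeg.vhdr'
  'sub-01', 'sub-01_task-oddball_eeg.vhdr'
  'sub-02', 'sub-02_task-rest_eeg.vhdr'
  };
for i = 1:size(demo, 1)
  folder = fullfile(bidsroot, demo{i,1}, 'eeg');
  if ~exist(folder, 'dir')
    mkdir(folder);
  end
  fclose(fopen(fullfile(folder, demo{i,2}), 'w'));
end

% Query: find all EEG header files anywhere below the dataset root ...
listing = dir(fullfile(bidsroot, '**', 'sub-*_eeg.vhdr'));

% ... and keep those whose filename contains the entity task-rest
names  = {listing.name};
isrest = ~cellfun(@isempty, regexp(names, '_task-rest_', 'once'));

% Full paths of the selected recordings
selected = arrayfun(@(f) fullfile(f.folder, f.name), listing(isrest), 'UniformOutput', false);
disp(selected);
```

Unlike constructing a filename from known parts, the query does not need to know in advance which subjects, sessions or runs exist.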

Implementing a function like ft_BIDS(...) to replace the concatenation of filename parts (either with [] or sprintf or fullfile) would be easy, but does not bring us that far. The strength of BIDS is that it standardizes and facilitates handling exceptions over subjects, similar to details_sub01.m. I think that needs more than constructing file names.
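For comparison, constructing the filenames from known parts with sprintf and fullfile could look like the sketch below. The dataset root, the entity values and the unpadded run index are all hypothetical; a real dataset may omit the session level or pad the run number.

```matlab
% Hypothetical dataset root and entity values
bidsroot = '/data/bids_dataset';
sub      = '01';
ses      = '01';
task     = 'rest';
runnr    = 3;      % called runnr to avoid shadowing the MATLAB function "run"
datatype = 'eeg';

% Shared part of all filenames that belong to this recording
basename = sprintf('sub-%s_ses-%s_task-%s_run-%d', sub, ses, task, runnr);
datadir  = fullfile(bidsroot, ['sub-' sub], ['ses-' ses], datatype);

% The data file and two of its sidecar files
datafile     = fullfile(datadir, [basename '_eeg.vhdr']);
channelsfile = fullfile(datadir, [basename '_channels.tsv']);
eventsfile   = fullfile(datadir, [basename '_events.tsv']);

disp(datafile);
```

This is easy to write, but it silently assumes that every subject has this session, task and run, which is exactly the kind of exception handling that the concatenation approach does not address.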

But considering only file name, what I am thinking of is

participants = ft_read_tsv(fullfile(bidsroot, 'participants.tsv'));
for p=1:size(participants,1)
  % here could be a filter on the participants, e.g. to select them on age, or handedness

  scans = ft_read_tsv(fullfile(bidsroot, ..., 'scans.tsv'));
  for s=1:size(scans,1)
    % here could be a filter on the scans, e.g. to select only the MEG datasets

    meg_json = ft_read_json(..., 'meg.json');
    channels_tsv = ft_read_tsv(..., 'channels.tsv');
    coordsystem_json = ft_read_json(..., 'coordsystem.json');

    % and here it becomes FieldTrip specific again
    % where channels_tsv is used in preprocessing
    % and coordsystem_json in the anatomical pipeline for the MEG volume conductor 

  end
end

What would an ft_BIDS() function have to look like to realize this, or would bids-matlab be able to do the trick?

robertoostenveld commented 2 years ago

That pseudo-code section maps onto the "analyze_single_subject" and "analyze_group" in https://github.com/robertoostenveld/Wakeman-and-Henson-2015 or in https://github.com/Donders-Institute/infant-cluster-effectsize

robertoostenveld commented 2 years ago

Oh, and note that in BIDS the participants.tsv and scans.tsv are both optional. The participants.tsv is commonly specified, but I have seen many datasets that do not include scans.tsv. I would expect some smart skeleton code like the above to work with these nested for-loops regardless of exceptions such as a missing participants.tsv.
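One possible way to make such skeleton code robust is to fall back on a directory listing whenever an optional file is absent, as in the sketch below. The demo dataset is created in a temporary folder and deliberately contains neither participants.tsv nor scans.tsv, so only the fallback branches run and FieldTrip is not needed; the '*_eeg.vhdr' pattern is just an example. The '**' wildcard and the folder field of dir assume MATLAB R2016b or later.

```matlab
% Demo only: two empty subject folders, no participants.tsv, no scans.tsv
bidsroot = tempname;
for name = {'sub-01', 'sub-02'}
  mkdir(fullfile(bidsroot, name{1}));
end

% Subjects: from participants.tsv when present, otherwise from the folder names
participantsfile = fullfile(bidsroot, 'participants.tsv');
if exist(participantsfile, 'file')
  participants = ft_read_tsv(participantsfile);  % requires FieldTrip
  subjects = participants.participant_id;
else
  d = dir(fullfile(bidsroot, 'sub-*'));
  d = d([d.isdir]);
  subjects = {d.name}';
end

for p = 1:numel(subjects)
  subjdir = fullfile(bidsroot, subjects{p});

  % Recordings: from scans.tsv when present (at subject or session level) ...
  datafiles  = {};
  scansfiles = dir(fullfile(subjdir, '**', '*_scans.tsv'));
  for s = 1:numel(scansfiles)
    scans = ft_read_tsv(fullfile(scansfiles(s).folder, scansfiles(s).name));
    % the filename column is relative to the location of the scans.tsv
    datafiles = [datafiles; fullfile(scansfiles(s).folder, scans.filename)];
  end

  % ... otherwise from a recursive search for the data files of interest
  if isempty(scansfiles)
    d = dir(fullfile(subjdir, '**', '*_eeg.vhdr'));
    datafiles = arrayfun(@(f) fullfile(f.folder, f.name), d, 'UniformOutput', false);
  end

  fprintf('%s: %d recording(s)\n', subjects{p}, numel(datafiles));
end
```

The loops themselves stay identical in both cases; only the way the lists of subjects and recordings are obtained differs.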

schoffelen commented 2 years ago

See #1855

JojoVh commented 2 years ago

Dear Robert and Jan-Mathijs,

Thank you so much for your feedback and the functions you described. Yes, something like bids-matlab is indeed a very general tool to query the data; thank you for pointing it out. I did have a different experience with e.g. mne and mne_bids with regard to a toolbox-integrated BIDS reading function, but perhaps that is a matter of taste.

Thank you again for your suggestions. In case I have further questions or new ideas, I'll let you know.

Best wishes,
Jonathan
@ICNeuromodulate Charité

fieldtrip / fieldtrip

Examples on how to read and process BIDS data with fieldtrip for the website #2067