courtois-neuromod / ds_prep

All the scripts to prepare the Courtois-Neuromod dataset

convert and segment physio recordings #6

Open bpinsard opened 4 years ago

bpinsard commented 4 years ago

Create a script to convert the AcqKnowledge files to an open format and segment them into functional runs.
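A minimal sketch of the conversion step, assuming the `bioread` package (which reads AcqKnowledge files) is available; the helper name and output layout here are illustrative, not part of the repo:

```python
import gzip

import numpy as np


def channels_to_tsv(names, arrays, out_path):
    """Write equal-length channel arrays to a gzipped TSV (open format)."""
    data = np.column_stack(arrays)
    with gzip.open(out_path, "wt") as f:
        f.write("\t".join(names) + "\n")
        for row in data:
            f.write("\t".join(f"{v:.6g}" for v in row) + "\n")


# Reading the .acq file itself would use bioread, e.g.:
# import bioread
# acq = bioread.read_file("session.acq")
# names = [ch.name for ch in acq.channels]
# arrays = [ch.data for ch in acq.channels]
# channels_to_tsv(names, arrays, "session_physio.tsv.gz")
```

Segmentation into runs would then slice each array on the scanner-trigger channel before writing one file per run.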

sangfrois commented 4 years ago

Hey @bpinsard, I see you are setting up a nice prep repo. Let me show you what I did during the last Brainhack and modified throughout the last months to get some readable data. I think it addresses the first two ticks in your list.

You'll find a notebook that sums up my previous attempts to read files, detect runs and save them.

My most recent attempt can be seen in the PR I'm building on phys2bids' repo. I think we've managed to slice the recordings in a way that fits efficiently with the rest of their code. Basically, it's a utility the user can call by giving a list of the number of triggers per run. For a whole session, we'd have one .acq file and a .tsv.gz for each run. It might take some more time to finish, as the outputs need to be fully BIDS-compliant, but at least we have the basis.
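The trigger-count slicing described above could be sketched like this; a toy illustration, not the actual phys2bids code, with the function name and voltage threshold made up:

```python
import numpy as np


def split_runs_by_trigger_counts(trigger, n_triggers_per_run, thresh=2.5):
    """Return (start, end) sample indices of each run, given the expected
    number of scanner triggers per run.

    trigger : 1-D array of the raw trigger channel (e.g. 0/5 V pulses).
    n_triggers_per_run : list of expected trigger counts, one per run.
    """
    high = trigger > thresh
    # Rising edges: sample is high, previous sample is low.
    onsets = np.flatnonzero(high[1:] & ~high[:-1]) + 1
    runs, i = [], 0
    for n in n_triggers_per_run:
        run_onsets = onsets[i:i + n]
        runs.append((int(run_onsets[0]), int(run_onsets[-1])))
        i += n
    return runs
```

In practice the end index would be padded by one TR's worth of samples so the last volume's physiology is kept, but the grouping logic is the same.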

Let me know whether it would be preferable to work on a home-made script or to focus on a .json heuristic file.

sangfrois commented 4 years ago

If you'll allow me, I would propose a change to the to-do list:

bpinsard commented 4 years ago

The first and second points are addressed in #7.

sangfrois commented 4 years ago

We still have to pull the old data, right? What #7 addresses is for future recordings, or am I mistaken?

bpinsard commented 4 years ago

Yes, for now we can assume that the input data will be structured more or less as in /data/neuromod/biopac/, so you can adapt your pipeline to accept this kind of input.

sangfrois commented 4 years ago

While I'm finishing up the phys2bids utility, here's a script to convert and segment Biopac recordings.

I didn't manage to set up a proper git workflow, as I can't create a fork and don't have write access.

bpinsard commented 4 years ago

Great! I would suggest saving the runs in a binary format (e.g. HDF5) rather than tsv.gz to reduce storage space and IO time. A tsv(.gz) could be provided for timeseries downsampled to the fMRI TR.
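The HDF5 option suggested above could look like this with `h5py`; a sketch, with the function name, dataset layout, and attribute name chosen here for illustration:

```python
import h5py
import numpy as np


def save_run_hdf5(path, signals, sfreq):
    """Save one run's physio channels to an HDF5 file.

    signals : dict mapping channel name -> 1-D numpy array.
    sfreq : sampling rate in Hz, stored as a file attribute.
    """
    with h5py.File(path, "w") as f:
        f.attrs["sampling_rate"] = sfreq
        for name, data in signals.items():
            # gzip compression keeps the 5 kHz traces compact on disk.
            f.create_dataset(name, data=data, compression="gzip")
```

Unlike tsv.gz, this allows reading a single channel without decompressing the whole file, which matters when a run passes through several preprocessing steps.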

You can't fork this repo? I will look at that.

sangfrois commented 4 years ago

Affirmative, I can't fork the repo.

Otherwise: I thought you wanted BIDS-compliant outputs; .tsv.gz is the standard. I don't get why we should opt for HDF5 if, in any case, we'll want a valid BIDS dataset...

It's not difficult to change, but still, why bother having two different data types?

bpinsard commented 4 years ago

I just changed the organization config to allow forking; it was disabled. :/ I think that the only files that will be in the BIDS structure and need to validate will be the highly processed files at the fMRI sampling rate, like HRV per TR for instance. As those are downsampled, storing them in .tsv(.gz) is not a problem. The rest of the files will likely live in <bidsroot>/sourcedata and will not need to fully comply with the BIDS standard. Storing data at 5 kHz in a text format, even with compression, is highly inefficient, in particular when the file goes through a pipeline of multiple preprocessing steps.
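The downsampling to the fMRI TR mentioned above amounts to averaging the high-rate signal within each TR window; a minimal sketch (window averaging assumed here, though HRV-type metrics would use their own per-TR computation):

```python
import numpy as np


def downsample_to_tr(signal, sfreq, tr):
    """Average a high-rate physio signal within each TR window.

    signal : 1-D array sampled at sfreq Hz.
    tr : repetition time in seconds.
    Returns one value per complete TR; trailing samples are dropped.
    """
    n_per_tr = int(round(sfreq * tr))
    n_trs = len(signal) // n_per_tr
    return signal[:n_trs * n_per_tr].reshape(n_trs, n_per_tr).mean(axis=1)
```

At 5 kHz and a 1.49 s TR, each output sample replaces ~7450 raw samples, which is why the downsampled tsv.gz stays small enough for the BIDS tree.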

sangfrois commented 4 years ago

I wasn't aware of the structure that was decided, thanks for the heads-up. I'll work out something to write HDF5 and open a PR. Should I compare against master?

In any case, I will create a dev branch on my fork.