Can you search for Open Source (and GPL compatible?) Python code to read and write the various file formats?
MNE-python?
Should we have bidscrambler_xxx where xxx is the file format? The reason for asking is that, unlike with nifti, there is not a single file format in use for EEG and MEG, but a few of each.
I think having one scrambler_eeg/meg.py function is fine. The different file formats can then be handled within that function. But if it becomes big, we can always split things. The user won't know anyhow, because the CLI is all handled by scrambler.py
I think we should not approach it from the general EEG/MEG perspective, but rather from the file format perspective. A BIDS dataset is a structured list of files with well-defined file names; when scrambling them you need to know which files to read/modify/write. Splitting the logic of determining which files to read/write from the content-wise scrambling provides a strategy to also implement this for files that are used by the other BIDS modalities (PET, NIRS, iEEG, microscopy, motion), even though we are not necessarily domain experts on all of these ourselves.
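For what it's worth, a minimal sketch of that split could look like the code below, assuming a generic walker that selects BIDS files by extension and dispatches them to per-format scramble functions (all names here are illustrative, not the actual BIDScramble API):

```python
# Hypothetical sketch: separate the file selection from the content-wise scrambling.
# None of these names are the actual BIDScramble API; they only illustrate the split.
from pathlib import Path


def scramble_fif(src: Path, dst: Path) -> None:
    """Placeholder: read, modify and write a .fif file."""
    dst.write_bytes(src.read_bytes())       # for now just copy the binary content


def scramble_tsv(src: Path, dst: Path) -> None:
    """Placeholder: scramble a tabular .tsv file."""
    dst.write_text(src.read_text())


# Map BIDS file extensions onto their content-wise scramblers
SCRAMBLERS = {'.fif': scramble_fif, '.tsv': scramble_tsv}


def scramble_dataset(inputdir: Path, outputdir: Path) -> None:
    """Walk the BIDS input directory and dispatch each file to its scrambler."""
    for src in inputdir.rglob('*'):
        if src.is_file() and src.suffix in SCRAMBLERS:
            dst = outputdir / src.relative_to(inputdir)
            dst.parent.mkdir(parents=True, exist_ok=True)
            SCRAMBLERS[src.suffix](src, dst)
```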
Could you start by extending the scrambler such that it reads and writes (i.e., copies) a .fif file? The fif naming scheme is well documented in the MEG specification. That would allow testing on use case 2.3. For now you don't have to worry about what is contained in the fif file, but the Python code that you write should have a placeholder for reading the (binary) content and for writing the content. In between the reading and writing, the content would be modified; that is something @schoffelen or I can implement.
Once we have it for .fif files (one of the MEG formats), we can take the next step and implement it for one of the EEG formats so that we can also continue with use case 2.4.
Earlier, I already pushed some placeholder code that for now should be capable of dealing with fif-files of the 'raw' and 'average' type (for the average: provided it contains just a single condition average; I still need to figure out how to generically detect and deal with multiple averages). The scrambling performed for now is (I think) a scrambling across channels, which is not really meaningful but good enough for now. I haven't tested this on a full BIDS directory yet, but I bids-ified a single test dataset I had lying around locally, and that seemed to work. In order to contribute to this efficiently, I first need to familiarize myself a bit more with how to efficiently develop and test code interactively using PyCharm, and to get used to the state-of-the-art testing framework implemented by @marcelzwiers
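For reference, a channel-wise permutation for the 'raw' case could look roughly like this (a sketch, not necessarily identical to the pushed placeholder code):

```python
# Rough sketch of scrambling across channels for a raw .fif file
import numpy as np
import mne


def permute_channels(raw, seed=None):
    """Return a new Raw object with the channel rows of the data shuffled."""
    rng  = np.random.default_rng(seed)
    data = raw.get_data()                           # shape: (n_channels, n_times)
    perm = rng.permutation(data.shape[0])
    return mne.io.RawArray(data[perm], raw.info)    # same channel info, shuffled signals
```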
Note: @schoffelen could you add a test_scramble_fif to the https://github.com/SIESTA-eu/wp15/tree/main/BIDScramble/tests folder and document its use in https://github.com/SIESTA-eu/wp15/blob/main/usecase-2.3/README.md#scrambled-data ?
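A minimal pytest sketch of what such a test could check; the import path, the scramble_fif entry point and the test file are assumptions that need to be adapted to the actual code:

```python
# Hypothetical tests/test_scramble_fif.py sketch; the import, entry point and test data
# path are assumptions and need to be adapted to the actual BIDScramble code.
from pathlib import Path
import mne

from bidscramble.scrambler_fif import scramble_fif     # hypothetical import path


def test_scramble_fif(tmp_path: Path):
    src = Path('tests/data/sub-01_task-test_meg.fif')   # hypothetical small test file
    dst = tmp_path / src.name

    scramble_fif(src, dst)                              # hypothetical entry point

    assert dst.is_file()
    scrambled = mne.io.read_raw_fif(dst)                # the output must still be a valid fif file
    assert scrambled.ch_names == mne.io.read_raw_fif(src).ch_names
```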
I tested it for use case 2.3 and it seems to work.
I don't know the valid options for scrambling, so that still needs some attention. It would be useful if @marcelzwiers could present BIDScramble in general for that.
Also, test_scramble_fif should be added to the tests.
A "null" and "permute" scrambler are now both in place, and the fif scrambling is part of the test suite.
This should be good enough for now; time to move on to the next scrambler, for EEG data, see #28
Too bad, overtaken on the right. I will discard my attempts then.
Oh, I thought you had other things to do this afternoon ;-)
Yes, that's right, but not after 5 o'clock anymore, and I had already started a bit... and with my limited Python skills I am moving a bit more slowly than the rest.
The pytest data is rather big (900MB or so), so it would be desirable to have a smaller file to work with. Also, the isevoked codepath is not tested.
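One way around the 900MB download might be to generate a tiny synthetic 'average' fif file on the fly, e.g. with MNE-Python, which would also exercise the isevoked codepath (a sketch; the channel names and output filename are arbitrary):

```python
# Sketch: write a tiny synthetic 'average' (evoked) fif file for testing, instead of
# relying on the ~900MB pytest dataset. Channel names and filename are arbitrary.
import numpy as np
import mne

info   = mne.create_info(ch_names=['MEG0111', 'MEG0112', 'MEG0113'], sfreq=250.0, ch_types='mag')
data   = np.random.default_rng(42).standard_normal((3, 125))            # 3 channels, 0.5 s of data
evoked = mne.EvokedArray(data, info, tmin=-0.1, comment='conditionA')

mne.write_evokeds('sub-01_task-test_ave.fif', evoked, overwrite=True)   # a few kB instead of ~900MB
```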
I have already created bidscramblers for tsv, json and nifti files, but I feel that I don't have the expertise to write a scrambler for EEG and MEG data.