mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

ENH: Adding ITAB file reader #12439

Open robbisg opened 8 months ago

robbisg commented 8 months ago

Describe the new feature or enhancement

We would like to add a reader for ITAB raw files to MNE-Python. @vpizzella wrote a first draft of the reader, and I am preparing it according to your guidelines, with tests and linting. I have a few questions about managing the testing data and the tests: 1) I noticed that the testing data lives in another repo; should I open a PR in that repo as well? 2) Is there a size limit for the files? 3) Is it better to have a single file that covers all implemented (and tested) features, or a set of files? 4) I used a MATLAB (FieldTrip) file to check that the reader reads the same numbers; is that the right approach, or is there another strategy?

Describe your proposed implementation

A first draft of the implementation can be found here, but it needs linting and so on.

Describe possible alternatives

We followed the implementation of the other readers, so ours is similar to them and probably close to optimal.

Additional context

No response

welcome[bot] commented 8 months ago

Hello! 👋 Thanks for opening your first issue here! ❤️ We will try to get back to you soon. 🚴

larsoner commented 8 months ago

I noticed that the testing data lives in another repo; should I open a PR in that repo as well? Is there a size limit for the files?

Yep, opening a PR there first is great. "As small as possible" is the rule of thumb. In practice, for most formats we get away with something < 1 MB. Basically ~1 s of data or less is hopefully enough. But if your acquisition system writes data in, say, 10 s blocks, then you'd want something like 20 s of data.
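As a rough sanity check on that size target, the footprint of the uncompressed samples can be estimated from the channel count, sampling rate, and sample width. The numbers below (64 channels, 1 kHz, 4-byte samples) are illustrative assumptions rather than ITAB specifics, and header overhead is ignored.

```python
def raw_size_bytes(n_channels, sfreq_hz, duration_s, bytes_per_sample):
    """Estimate the size of the sample data alone (headers not counted)."""
    n_samples = int(round(sfreq_hz * duration_s))
    return n_channels * n_samples * bytes_per_sample


# Hypothetical acquisition settings, chosen only for illustration.
size = raw_size_bytes(
    n_channels=64, sfreq_hz=1000.0, duration_s=1.0, bytes_per_sample=4
)
print(f"{size / 1e6:.3f} MB")  # 0.256 MB
```

With these assumed settings, one second stays well under 1 MB, while 20 s would not, so a system that writes long blocks may need fewer channels or a lower sampling rate in the test recording.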

Is it better to have a single file that covers all implemented (and tested) features, or a set of files?

I would say whatever uses the least disk space while providing proper code/functionality coverage. So, for example, one file with weird lowpass settings and UTF-8 encoding, plus another file with normal lowpass settings and latin-1 encoding, would be fine and preferred over 4 files (normal lowpass, weird lowpass, UTF-8, latin-1).
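One way to check that a small set of files still covers every feature is to list what each file exercises and compare the union against the required set. This is only a sketch: the file names and feature labels are hypothetical and simply mirror the lowpass/encoding example above.

```python
# Hypothetical test files and the reader features each one exercises.
files = {
    "weird_lowpass_utf8.raw": {"lowpass:weird", "encoding:utf-8"},
    "normal_lowpass_latin1.raw": {"lowpass:normal", "encoding:latin-1"},
}
required = {
    "lowpass:weird",
    "lowpass:normal",
    "encoding:utf-8",
    "encoding:latin-1",
}

# Union of everything the files exercise, then the features left uncovered.
covered = set().union(*files.values())
missing = required - covered
assert not missing, f"uncovered features: {sorted(missing)}"
print(f"{len(files)} files cover {len(required)} features")
```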

I used a MATLAB (FieldTrip) file to check that the reader reads the same numbers; is that the right approach, or is there another strategy?

Yes, if you have a reference implementation you know/trust is correct, then often what we do is have the native format plus a little .mat file containing the data read with FieldTrip or EEGLAB or whatever, then we `assert_allclose(raw.get_data(), read_mat(...)['data'])` (paraphrasing).
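A minimal, self-contained sketch of that comparison pattern follows. It makes several assumptions: `scipy.io.loadmat` stands in for the `read_mat` call paraphrased above, the `.mat` file is written from Python here only so the example runs (in practice it would be exported from MATLAB), and a copied array stands in for `raw.get_data()` because the ITAB reader does not exist yet.

```python
import os
import tempfile

import numpy as np
from numpy.testing import assert_allclose
from scipy.io import loadmat, savemat

rng = np.random.default_rng(0)
# Stand-in for the data a reference tool (e.g. FieldTrip) would export.
reference = rng.standard_normal((4, 100))

with tempfile.TemporaryDirectory() as tmp:
    mat_path = os.path.join(tmp, "reference.mat")  # hypothetical file name
    # In a real test this file would come from MATLAB, not from savemat.
    savemat(mat_path, {"data": reference})

    # Stand-in for raw.get_data() returned by the new reader.
    reader_data = reference.copy()

    # Load the reference values and compare within a relative tolerance.
    expected = loadmat(mat_path)["data"]
    assert_allclose(reader_data, expected, rtol=1e-7)
```

The tolerance is a judgment call: if the reference tool applies different scaling or stores single-precision values, a looser `rtol` may be needed.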