lina-usc / pylossless

🧠 EEG Processing pipeline that annotates continuous data
https://pylossless.readthedocs.io/en/latest/
MIT License
20 stars 8 forks source link

adding format parameter to save function #159

Open drpedapati opened 2 months ago

drpedapati commented 2 months ago

This pull request adds a format and 'event_id' parameter to the pipeline.save() method in pipeline.py. A new pipeline method gets event_ids from existing event codes and annotations.

Changes Made

Motivation

When working with non-EDF formats, such as BrainVision, the current implementation leads to errors. This change allows for greater flexibility in output formats, improving the library's usability across different EEG data types. The event_id field perserves existing event codes, otherwise arbritary codes are assigned.

Potential error messages these changes address:

ValueError: The provided raw data contains annotations, but "event_id" does not contain entries for all annotation descriptions. The following entries are missing: BAD_LL_noisy, BAD_LL_noisy_ICs

scott-huberty commented 2 months ago

Thanks for the contribution! And sorry for the belated reply. I will try to take a look this week. This is the first time that the CI's have run in a while so I suspect that there will be CI failures that are unrelated to this PR.

scott-huberty commented 1 month ago

I haven't forgot about this, you just caught me in a busy week last week. Will try again to get to it this week.

drpedapati commented 1 month ago

Hi Scott,

Same here, we are on deadline for a grant, so I will be making code changes and pushing when possible, will circle back. No worries at all, great project you guys have. Would you have some interest in closer collaboration in terms of helping optimize pipeline parameters for neurodevelopmental studies? We have over 5000 EEGs for autism, Fragile X, TSC etc. and happy to share authorship or joint grant applications.

We are located at Cincinnati Children’s and Makoto from the EEGLAB team joined our lab last year. I have Peyton Siekierski who is our MD/PHD student.

Ernie

From: Scott Huberty @.> Date: Monday, July 15, 2024 at 4:23 PM To: lina-usc/pylossless @.> Cc: Pedapati, Ernie @.>, Author @.> Subject: Re: [lina-usc/pylossless] adding format parameter to save function (PR #159)

This email originated from an EXTERNAL sender to CCHMC. Proceed with caution when replying, opening attachments, or clicking links in this message.

I haven't forgot about this, you just caught me in a busy week last week. Will try again to get to it this week.

— Reply to this email directly, view it on GitHubhttps://github.com/lina-usc/pylossless/pull/159#issuecomment-2229334334, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AELVSDNJP4UY5JXDYOE4SBTZMQVTRAVCNFSM6AAAAABKPSKVISVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMRZGMZTIMZTGQ. You are receiving this because you authored the thread.Message ID: @.***>

scott-huberty commented 1 month ago

Would you have some interest in closer collaboration in terms of helping optimize pipeline parameters for neurodevelopmental studies? We have over 5000 EEGs for autism, Fragile X, TSC etc. and happy to share authorship or joint grant applications.

Nice! My first research job was on a TSC study.

I'm happy to improve this pipeline in whatever ways will meet people's needs. This pipeline was initially developed for our neurodevelopmental EEG study so your use-case certainly sounds like one we'd want to support.

Regarding closer collaboration/papers/grants, this opportunity sounds interesting to me, but I want to give @christian-oreilly, the PI of this project and other core dev, a chance to weigh in (he is Out-of-Office until late July and so may be delayed in responding).

drpedapati commented 1 month ago

Hi Scott,

I love the details – still wrapped up in grant prep, but will circle back shortly. Ernie

scott-huberty commented 1 month ago

@drpedapati no worries Ernie, no rush. In the mean time I may merge main into this branch so that we can make the CI's happy, and I might add a tiny test.

christian-oreilly commented 1 month ago

Would you have some interest in closer collaboration in terms of helping optimize pipeline parameters for neurodevelopmental studies? We have over 5000 EEGs for autism, Fragile X, TSC etc. and happy to share authorship or joint grant applications.

Thanks for contributing this PR, @drpedapati, and sorry for the late response due to vacation. As mentioned by @scott-huberty, we welcome input from the community and will make our best effort to integrate it so that this pipeline can best serve the needs of the community. @scott-huberty already answered this comment for his side; for mine, I can add that we (my lab at the University of South Carolina) are always happy to collaborate on topics related to EEG in ASD and neurodevelopment. This is a significant axis of our research, in collaboration with the broader Carolina Autism and Neurodevelopment Research Center. Do you have specific objectives in mind where you think our involvement could be beneficial? I am happy to discuss these at your convenience.

  1. Allow the user to save their EEG data in a format other than EDF, to a format is already supported by mne-bids (i.e. brain vision, fif, eeglab`). I agree that we should support this. We initially forced the .edf format for historical reasons (the MATLAB lossless pipeline did this).

I think there is a bit more than historical reasons and backward compatibility; PyLossless was designed to work with BIDS. If I remember well, BIDS (or the MNE support of BIDS) was oriented toward the EDF format. It now supports more formats (including BrainVision: https://bids-specification.readthedocs.io/en/stable/modality-specific-files/electroencephalography.html) and I think we should aim to support all formats supported by BIDS with as little format-specific code as possible (i.e., having external libraries such as mne-bids dealing with most of BIDS support directly).

Since a user no long has to save their data to edf, then maybe we should make edifo an optional dependency?

Agreed... but let's make sure that this is transparent to the user (e.g., having a very descriptive error message on how to pip install edfio if someone tries to output with EDF and does not install this optional dependency). I am wondering if we could install it on-the-fly (e.g., https://stackoverflow.com/a/55188705/1825043 ) when necessary. I am not sure if this is seen as a bad practice, however...

@drpedapati I assume you answered to this issue directly through email? If not too inconvenient, it is probably better to click on the "view on GitHub" link (or something to that effect) in the email and respond directly on GitHub. I am suggesting this because your answers currently copy all the history of the discussion which clutters the ticket and makes it more cumbersome to follow. I edited your previous answer to remove all this duplication. I made a local copy of this duplicated text in case you embedded some relevant info in this duplicated response that I missed. Just let me know if this is the case and I'll restore this duplicated text.