mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

MNE usage with sleep data #5684

Closed agramfort closed 5 years ago

agramfort commented 5 years ago

Let me share some thoughts on how working with EEG sleep data could be simplified once we have the annotations API in place. Given, say, two files (one with the raw data and one with the annotations), it should be as easy as:

import mne
raw = mne.io.read_raw_edf('data.edf')
annotations = mne.read_annotations('sleep_annots.edf')
raw.set_annotations(annotations)
events, event_id = mne.events_from_annotations(raw)
epochs = mne.Epochs(raw, events, event_id, tmin=0., tmax=30., baseline=None)
epochs['N1']  # to get all epochs of N1 type etc.

cc @massich @Slasnista

It would be great to have this in place for the release. It's pretty much done, but it needs some ironing out.
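
To make the middle step of the snippet above concrete, here is a minimal sketch in plain Python (no MNE calls) of what turning annotations into an events list amounts to. The helper name `annotations_to_events`, the 100 Hz sampling rate, and the stage labels are made up for illustration. Assigning integer codes to the sorted unique descriptions, starting at 1, is an assumption of this sketch and not a statement about what the library does.

```python
def annotations_to_events(onsets, descriptions, sfreq, event_id=None):
    """Turn (onset in seconds, description) pairs into [sample, 0, code] rows.

    Hypothetical helper for illustration only; it is not part of MNE.
    """
    if event_id is None:
        # Assumption: one integer code per unique description, in sorted order.
        event_id = {d: i for i, d in enumerate(sorted(set(descriptions)), start=1)}
    events = []
    for onset, desc in zip(onsets, descriptions):
        if desc in event_id:  # descriptions missing from event_id are skipped
            events.append([int(round(onset * sfreq)), 0, event_id[desc]])
    return events, event_id


# Three contiguous 30 s sleep stages sampled at 100 Hz (made-up values).
onsets = [0.0, 30.0, 60.0]
descriptions = ["W", "N1", "N1"]
events, event_id = annotations_to_events(onsets, descriptions, sfreq=100.0)
print(events)    # [[0, 0, 2], [3000, 0, 1], [6000, 0, 1]]
print(event_id)  # {'N1': 1, 'W': 2}
```

Each row can then serve as the time-locking point of one 30 s epoch, and the description-to-code mapping is what makes a selection such as epochs['N1'] possible.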

massich commented 5 years ago

+1

massich commented 5 years ago

Is there any sleep data already in MNE-Python?

dengemann commented 5 years ago

That's conceptually very close to what I did manually in a more complicated way a few years ago. I like it!

dengemann commented 5 years ago

Then you could easily loop over segments, recast a 30 seconds epoch into a RawArray, re-epoch, etc.
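
As a rough illustration of the re-epoching idea, the sketch below cuts one 30 s segment into shorter fixed-length windows in plain Python. It does not use RawArray or Epochs. The helper name `split_into_windows`, the 100 Hz sampling rate, and the 5 s window length are assumptions chosen for the example.

```python
def split_into_windows(samples, sfreq, window_sec):
    """Split a 1-D sequence of samples into non-overlapping windows.

    Hypothetical helper; trailing samples that do not fill a window are dropped.
    """
    step = int(round(window_sec * sfreq))
    n_windows = len(samples) // step
    return [samples[i * step:(i + 1) * step] for i in range(n_windows)]


sfreq = 100.0                               # assumed sampling rate in Hz
segment = list(range(int(30 * sfreq)))      # stand-in for one 30 s epoch
windows = split_into_windows(segment, sfreq, window_sec=5.0)
print(len(windows), len(windows[0]))        # 6 500
```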

jona-sassenhagen commented 5 years ago

API-wise it looks good.

Slasnista commented 5 years ago

Ok, I'll go over it and make you a proposal.

Just a question: can multiple processes access the same EDF record at the same time? It would be great to use it in a batch generator for deep learning.

agramfort commented 5 years ago

I have never tried. But let’s make it work before making it fast.

jasmainak commented 5 years ago

@agramfort I would argue that you shouldn't need:

events, event_id = mne.events_from_annotations(raw)

since the annotations already contain the duration. Specifying tmin and tmax therefore risks overshooting the duration of the annotation.

massich commented 5 years ago

That's taken care of in set_annotations.

jasmainak commented 5 years ago

So, if I do:

epochs = mne.Epochs(raw, events, event_id, tmin=-2, tmax=raw.annotations.duration.max() + 0.1, baseline=None)

I will get an error?

massich commented 5 years ago

Oh, sorry. I thought you were referring to the trailing annotation, not to the time window used when epoching. But I would not do that either.

I would use the onsets from the annotations for epoching and then decide on the time window.

agramfort commented 5 years ago

@jasmainak I am not sure I get your concern here. Many formats use annotations to mark stim onsets, like events. We just allow this with events and event_id. If the durations are not 0 or fixed for some annotations, then we'll need to invent something new. But we'll need a good use case.

jasmainak commented 5 years ago

I'm concerned about the use case you shared in your code snippet above. The duration is not 0 there and the annotations are (probably) contiguous. That is to say, right after N1 ends, another sleep stage starts. If you allow the user to specify arbitrary tmin or tmax in the Epochs constructor, their epochs could contain two (or more) sleep stages instead of one.

I think adding a check inside the Epochs constructor that makes sure your epochs don't contain more than one annotation should be fine.
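
The kind of check meant here could look roughly like the sketch below, written in plain Python and not taken from the Epochs constructor. The helper name `annotations_in_window` and the example values are hypothetical, and the epoch window is treated as half-open, [start, stop), which is a simplification.

```python
def annotations_in_window(onsets, durations, descriptions, start, stop):
    """Return descriptions of annotations overlapping the interval [start, stop).

    Hypothetical helper for illustration; all times are in seconds.
    """
    hits = []
    for onset, duration, desc in zip(onsets, durations, descriptions):
        # Two intervals overlap if each one starts before the other ends.
        if onset < stop and start < onset + duration:
            hits.append(desc)
    return hits


# Contiguous 30 s stages (made-up values).
onsets = [0.0, 30.0, 60.0]
durations = [30.0, 30.0, 30.0]
descriptions = ["W", "N1", "N2"]

# Epoch locked to the N1 onset with tmin=0, tmax=30: only N1 is covered.
print(annotations_in_window(onsets, durations, descriptions, 30.0, 60.0))  # ['N1']
# Same onset with tmin=-2, tmax=30.1: the epoch now spills into W and N2.
print(annotations_in_window(onsets, durations, descriptions, 28.0, 60.1))  # ['W', 'N1', 'N2']
```

A warning or error could then be raised whenever more than one description comes back for an epoch.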

massich commented 5 years ago

I would say that this is experiment-specific. Even if I have annotations of contiguous regions, all of the same size, why should I not be able to get epochs from onset - 0.1 to onset + duration + 0.1? This might not make sense for most experiments, but I don't see why it should not be possible.

jasmainak commented 5 years ago

If I'm trying to do a classification task between sleep stages, I don't want my data to be contaminated by other classes. I'm worried that people will unknowingly use the wrong tmax (e.g., by copy-paste from example) when they did not intend to.

Of course, maybe people want to go from onset - 0.1 to onset + duration + 0.1, but I'm not aware of such use cases. Maybe there are valid reasons to do this too ...

Slasnista commented 5 years ago

Actually, you might want to introduce some shifts to improve the translation invariance of a neural net, for instance, so allowing somebody to load a sample from onset - 0.1 to onset + duration + 0.1 is interesting.

Also, if you are working on event detection, e.g. spindle detection, you might have to work with events of variable duration. It would be great to load either the portion of the signal corresponding to the event, or a larger window of signal containing this event and potentially others, to feed into a detection algorithm.
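
Both cases boil down to computing a sample window from an annotation's onset and duration plus some padding. A minimal plain-Python sketch, in which the helper name `event_window`, the 100 Hz sampling rate, and the example events are assumptions:

```python
def event_window(onset, duration, sfreq, pad=0.0, n_samples=None):
    """Return (start, stop) sample indices covering an annotation plus padding.

    Hypothetical helper; `pad` seconds are added on both sides, and the window
    is clipped to the recording when `n_samples` is given.
    """
    start = int(round((onset - pad) * sfreq))
    stop = int(round((onset + duration + pad) * sfreq))
    start = max(start, 0)
    if n_samples is not None:
        stop = min(stop, n_samples)
    return start, stop


sfreq = 100.0  # assumed sampling rate in Hz
# A 30 s sleep stage loaded from onset - 0.1 to onset + duration + 0.1.
print(event_window(30.0, 30.0, sfreq, pad=0.1))   # (2990, 6010)
# A 1.5 s spindle with 2 s of context on each side.
print(event_window(42.0, 1.5, sfreq, pad=2.0))    # (4000, 4550)
```

Drawing `pad` (or a separate shift) at random per sample would give the jittered windows mentioned above.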

agramfort commented 5 years ago

@jasmainak I don't share your concerns. It's your responsibility to know what tmax to use. If you do sleep staging you know that tmax is 30s. Also it's very possible to have annotations that overlap in time. You can have a spindle in a sleep stage and you may want both. But in your epochs you'll set event_id to exclude spindles if you only care about sleep stages.

Slasnista commented 5 years ago

Hi,

The following code raises an error on MASS SS3:

    raw = mne.io.read_raw_edf('Mass_SS3/01-03-0001 PSG.edf', preload=True)

    annot_1 = mne.read_annotations('Mass_SS3/01-03-0001 Annotations.edf')
    raw.set_annotations(annot_1)
    events, event_id = mne.events_from_annotations(raw)
    epochs = mne.Epochs(
        raw, events, event_id, tmin=0., tmax=30., baseline=None)

The error is the following:

RuntimeError                              Traceback (most recent call last)
<ipython-input-1-78bf10e62987> in <module>
     41     events, event_id = mne.events_from_annotations(raw)
     42     epochs = mne.Epochs(
---> 43         raw, events, event_id, tmin=0., tmax=30., baseline=None)

~/anaconda3/envs/lol3/lib/python3.7/site-packages/mne-0.17.dev0-py3.7.egg/mne/epochs.py in __init__(self, raw, events, event_id, tmin, tmax, baseline, picks, preload, reject, flat, proj, decim, reject_tmin, reject_tmax, detrend, on_missing, reject_by_annotation, metadata, verbose)

~/anaconda3/envs/lol3/lib/python3.7/site-packages/mne-0.17.dev0-py3.7.egg/mne/utils.py in verbose(function, *args, **kwargs)
    941         with use_log_level(verbose_level):
    942             return function(*args, **kwargs)
--> 943     return function(*args, **kwargs)
    944
    945

~/anaconda3/envs/lol3/lib/python3.7/site-packages/mne-0.17.dev0-py3.7.egg/mne/epochs.py in __init__(self, raw, events, event_id, tmin, tmax, baseline, picks, preload, reject, flat, proj, decim, reject_tmin, reject_tmax, detrend, on_missing, reject_by_annotation, metadata, verbose)
   2160             reject_tmax=reject_tmax, detrend=detrend,
   2161             proj=proj, on_missing=on_missing, preload_at_end=preload,
-> 2162             verbose=verbose)
   2163
   2164     @verbose

~/anaconda3/envs/lol3/lib/python3.7/site-packages/mne-0.17.dev0-py3.7.egg/mne/epochs.py in __init__(***failed resolving arguments***)
    316             events = events[selected]
    317             if len(np.unique(events[:, 0])) != len(events):
--> 318                 raise RuntimeError('Event time samples were not unique')
    319             n_events = len(events)
    320             if n_events > 1:

RuntimeError: Event time samples were not unique

This comes from the fact that some events have the same start time: e.g. a sleep stage has the same start time as an artifact annotation.

-> Should we filter the events before epoching or after?
-> If we filter events after epoching, could we allow multiple events to have the same start time?
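
For the first option (filtering before epoching), the plain-Python sketch below reproduces the uniqueness test seen in the traceback and shows how restricting the descriptions removes the clash. The helper name `has_duplicate_onsets`, the "artifact" label, and the sample values are made up for illustration.

```python
def has_duplicate_onsets(events):
    """Return True if two events share the same sample index (first column)."""
    samples = [row[0] for row in events]
    return len(set(samples)) != len(samples)


# (sample, description) pairs; the artifact starts together with the second stage.
annotated = [(0, "Sleep stage W"), (3000, "Sleep stage 1"), (3000, "artifact")]

# Keeping every description reproduces the clash.
all_ids = {"Sleep stage W": 1, "Sleep stage 1": 2, "artifact": 3}
events = [[s, 0, all_ids[d]] for s, d in annotated]
print(has_duplicate_onsets(events))  # True

# Filtering before epoching: keep only the sleep stages.
stage_ids = {"Sleep stage W": 1, "Sleep stage 1": 2}
events = [[s, 0, stage_ids[d]] for s, d in annotated if d in stage_ids]
print(has_duplicate_onsets(events))  # False
```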

agramfort commented 5 years ago

How can we replicate this?

Slasnista commented 5 years ago

Yes, the data are on our Dropbox and the script is called check_mne_reader.py.

massich commented 5 years ago

The error was due to the fact that events_from_annotations was called without specifying event_id. It therefore defaulted to None, and all the events were selected.

events, event_id = mne.events_from_annotations(raw, event_id=None)

Passing event_id solves the problem:

my_event_id = {'Sleep stage ?': 1,
               'Sleep stage W': 2,
               'Sleep stage 1': 3,
               'Sleep stage 2': 4,
               'Sleep stage 3': 5,
               'Sleep stage R': 6}
events, event_id = mne.events_from_annotations(raw, event_id=my_event_id)

agramfort commented 5 years ago

Great, so we're good to go on public sleep data then?

massich commented 5 years ago

We should. @Slasnista said he would draft an example with public data. I'll try to integrate it into MNE.

@bdyetton pointed us to some publicly available sleep data in https://github.com/mne-tools/mne-python/issues/4494#issuecomment-422087517

mmagnuski commented 5 years ago

Just a reminder that it won't be as easy with eeglab data: to stay backward compatible, the eeglab event reader currently does more than just events, event_id = mne.events_from_annotations(raw) (for example, event names that contain digits are turned into integers).

massich commented 5 years ago

I think this was addressed, or is available through event_id as a callable. Do you have an example that is not working?

mmagnuski commented 5 years ago

No, I'll check, I'd be happy to be wrong on this. :) What I meant is that before annotations one could just read the eeglab file and then use mne.find_events(), while currently you need to create a function that parses event names and pass it to events_from_annotations, right? But I'll check that later in hope that it is as easy as it was before. :)

mmagnuski commented 5 years ago

Currently the eeglab reader does this internally:

def _event_id_func(trigger, event_id, event_id_func, dropped):
    """Mimic old behavior to be used with events_from_annotations."""
    if event_id is not None and trigger in event_id:
        return event_id[trigger]
    if event_id_func == 'strip_to_integer':
        trigger_new = "".join([x for x in trigger if x.isdigit()])
        if trigger_new.isdigit():
            return int(trigger_new)
        else:
            dropped.append(trigger)
            return None
    elif event_id_func is not None:
        return event_id_func(trigger)

# ...
        # create event_ch from annotations
        annot = read_annotations(input_fname)
        self.set_annotations(annot)

        _check_boundary(annot, event_id)

        latencies = np.round(annot.onset * self.info['sfreq'])
        _check_latencies(latencies)

        dropped_desc = []  # use to collect dropped descriptions
        event_id_ = partial(_event_id_func,
                            event_id=event_id,
                            event_id_func=event_id_func,
                            dropped=dropped_desc)
        events, _ = events_from_annotations(self, event_id=event_id_)

and then many additional checks. So it currently does not seem possible to drop what the eeglab reader does internally and unify it with the future default of events_from_annotations. Sorry for littering this discussion: this is not specific to sleep data, but relates to whether reading eeglab files could be made easy using events_from_annotations while still producing the same events as before. That is a separate discussion.
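
For readers who want to see the 'strip_to_integer' behavior in isolation, here is a standalone plain-Python sketch of the same idea, detached from the reader code above. The function and the example descriptions are illustrative and are not the reader's actual implementation.

```python
def strip_to_integer(description):
    """Keep only the digits of a description; return None if there are none."""
    digits = "".join(ch for ch in description if ch.isdigit())
    return int(digits) if digits else None


# Made-up event descriptions.
descriptions = ["S 12", "stim7", "boundary", "R128"]
event_id = {}
dropped = []
for desc in descriptions:
    code = strip_to_integer(desc)
    if code is None:
        dropped.append(desc)   # no digits at all: nothing to map to
    else:
        event_id[desc] = code

print(event_id)  # {'S 12': 12, 'stim7': 7, 'R128': 128}
print(dropped)   # ['boundary']
```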

agramfort commented 5 years ago

@mmagnuski we are aiming to avoid having some custom handling of annotations across our readers. What was (is still) done by eeglab raw reader should be deprecated although it should still be simple to do with the new unified Annotations API. As I don't use eeglab I don't have a clear vision. What are your thoughts on this? Can you live with the new API?

By the way, @massich, did we already deprecate all the stim channel generation for EEG readers as planned? It should be done before the release; otherwise it will delay the transition by 6 months. Do we all agree on this? cc @cbrnr @jona-sassenhagen

jona-sassenhagen commented 5 years ago

I don't use EEGLAB data myself these days, so I might not be the best advocate. But remember it's a secondary format - it is for data recorded in another format, and preprocessed in EEGLAB. So it might not be a top-priority format.

That said, deprecation is fine with me.

mmagnuski commented 5 years ago

@agramfort Let's continue this conversation in the issue regarding backwards compatibility of eeglab reading, ok? I didn't want to sidetrack the conversation here so much.

massich commented 5 years ago

So this should warn?

import os.path as op
import mne
edf_path = op.join(op.dirname(op.dirname(mne.__file__)), 'mne/io/edf/tests/data/test.edf')
raw_py = mne.io.read_raw_edf(edf_path, misc=range(-4, 0), stim_channel=139, preload=True)

bphanikrishna commented 5 years ago

I am Krishna B, a student from India. My research uses EEG signals from a wireless sensor network to detect drowsiness. I have questions about the procedure described above for extracting the wakeful and drowsy stages from the freely available PhysioNet sleep EEG data, which is available at: https://physionet.org/physiobank/database/sleep-edfx/sleep-cassette/

I am stuck and would appreciate some help. Could you spare some time to answer my questions? It would be a great support for my research work. Any time that is convenient for you is fine with me.

agramfort commented 5 years ago

Closed by https://github.com/mne-tools/mne-python/pull/5718