vferat / pycrostates

https://pycrostates.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

[BUG] Overlapping 'bad' annotations cause wrong BaseCluster._segment_raw() behaviour #78

Closed vferat closed 2 years ago

vferat commented 2 years ago

The reject_by_annotations argument does not work properly in BaseCluster._segment_raw() when 'bad' annotations overlap.

import os
import numpy as np
import mne

sample_data_folder = mne.datasets.sample.data_path()
sample_data_raw_file = os.path.join(sample_data_folder, 'MEG', 'sample',
                                    'sample_audvis_raw.fif')
raw = mne.io.read_raw_fif(sample_data_raw_file)
raw.set_annotations(None)

annotation_0 = mne.Annotations(onset=100, duration=100, description='bad')
annotation_1 = mne.Annotations(onset=130, duration=30, description='bad')

raw_01 = raw.copy()
raw_01.set_annotations(annotation_0 + annotation_1)  # overlapping 'bad' annotations

We apply the logic from _segment_raw():

from mne.annotations import _annotations_starts_stops

onsets, ends = _annotations_starts_stops(raw_01, ["BAD"])
onsets = onsets.tolist() + [raw_01.get_data().shape[-1] - 1]
ends = [0] + ends.tolist()

for onset, end in zip(onsets, ends):
    print(end, onset)

output:

0 60061
120123 78080
96098 166799

The segments are completely wrong because of the overlapping 'bad' annotations. The expected behaviour would be:

0 60061 # start of recording - start of first 'bad' annotation
120123 166799 # end of first 'bad' annotation - end of recording

Not sure if we should fix this in pycrostates or if it should be fixed in _annotations_starts_stops in MNE.
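For illustration, the expected segments above can be reproduced without MNE by merging the overlapping 'bad' intervals before taking their complement. This is only a sketch: `good_segments` is a hypothetical helper, not part of pycrostates or MNE, and it assumes half-open (start, stop) sample ranges, so the last stop is the number of samples (166800) rather than the last sample index (166799).

```python
def good_segments(onsets, ends, n_samples):
    """Return (start, stop) sample pairs not covered by any bad interval.

    Hypothetical helper for illustration; stops are exclusive.
    """
    # Sort the bad intervals by onset so that overlapping ones are adjacent.
    merged = []
    for onset, end in sorted(zip(onsets, ends)):
        if merged and onset <= merged[-1][1]:
            # Overlaps (or touches) the previous bad interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([onset, end])

    # Walk through the merged bad intervals and keep the gaps between them.
    segments = []
    cursor = 0
    for onset, end in merged:
        if onset > cursor:
            segments.append((cursor, onset))
        cursor = max(cursor, end)
    if cursor < n_samples:
        segments.append((cursor, n_samples))
    return segments


# Onsets and ends taken from the output reported above.
print(good_segments([60061, 78080], [120123, 96098], 166800))
# [(0, 60061), (120123, 166800)]
```

The second annotation is fully contained in the first, so merging leaves a single bad interval and two good segments.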

mne.__version__
1.1.0

mscheltienne commented 2 years ago

Ok, yes I can see it here... we should retrieve the onsets and ends per annotation instead of once for all annotations.

mscheltienne commented 2 years ago

I'll check _annotations_starts_stops to figure out if it should be fixed in MNE.

vferat commented 2 years ago

Looks like we can simply use invert=True:

onsets, ends = _annotations_starts_stops(raw_01, ["BAD"], invert=True)

for onset, end in zip(onsets, ends):
    print(onset, end)

output:

0 60061
120123 166800

mscheltienne commented 2 years ago

Haven't looked at your example in detail yet, but are you sure?

import numpy as np

from mne import Annotations, create_info
from mne.annotations import _annotations_starts_stops
from mne.io import RawArray

data = np.random.randn(1, 10)
info = create_info(["EEG 001"], 1, "eeg")
raw = RawArray(data, info)

onset = [1, 2]
durations = [7, 2]
annotations = Annotations(onset, durations, "bads")
raw.set_annotations(annotations)

onsets, ends = _annotations_starts_stops(raw, "bads")

I'm getting:

onsets
Out[22]: array([1, 2])

ends
Out[23]: array([8, 4])

which looks correct?
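One way to reconcile the two observations: the per-annotation onsets and ends are correct, but pairing them with a shifted zip, as in the _segment_raw() logic quoted earlier in the thread, assumes the bad intervals do not overlap. A minimal pure-Python sketch using the values from this toy example (variable names are illustrative only):

```python
# Values returned for the toy example: two overlapping 'bads' annotations.
onsets = [1, 2]
ends = [8, 4]
n_samples = 10

# Same pairing as the logic quoted earlier in the thread: each good segment
# runs from the end of one bad interval to the onset of the next one.
segment_stops = onsets + [n_samples - 1]
segment_starts = [0] + ends

pairs = list(zip(segment_starts, segment_stops))
print(pairs)  # [(0, 1), (8, 2), (4, 9)]

# A segment whose start lies after its stop reveals the broken pairing.
invalid = [(start, stop) for start, stop in pairs if start > stop]
print(invalid)  # [(8, 2)]
```

With non-overlapping annotations every pair satisfies start <= stop, so the problem only shows up when one bad interval begins before the previous one has ended.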

mscheltienne commented 2 years ago

It looks to me like https://github.com/vferat/pycrostates/blob/1cb8d0c27e48ce9e6aa0c57ea2773dc89bd22d42/pycrostates/cluster/_base.py#L814-L815

is the issue. But I'll have to dig in more to remember what it does and to figure out what is happening here.

mscheltienne commented 2 years ago

Also look at this method in raw, it may be useful: https://github.com/mne-tools/mne-python/blob/880e883c06184160c30d50da06803e67977ac366/mne/io/base.py#L431-L474

It's what is used by reject_by_annotations to create Epochs.