mne-tools / mne-connectivity

Connectivity algorithms that leverage the MNE-Python API.
https://mne.tools/mne-connectivity/dev/index.html
BSD 3-Clause "New" or "Revised" License
68 stars 34 forks source link

Preserve events from Epochs -> Connectivity #22

Closed adam2392 closed 2 years ago

adam2392 commented 3 years ago

Problem Statement

In MNE-Python, constructing windows of EEG/iEEG data over time is represented as Epochs. For example, one can use mne.make_fixed_length_epochs to create a sliding window of raw EEG data, which one can then compute spectral tfr, or e.g. a VAR model for each window.

Epochs though is a misnomer imo because it represents time, but can also be "discontiguous". For example, one might have a subject recording where they are moving a joystick. Time-locking 1 second after they move a joystick, one can construct an Epochs data structure of the EEG data recording aligned to this event of interest. However, these "trials" are now not necessarily spaced out with the same amount of time as in the above example.

In MNE-connectivity, I am currently implementing VAR models. The input data is currently structured as (n_epochs, n_signals, n_times). However, this can either be a sliding window of VAR models, or just epoched trials.

So there is really no way to differentiate EpochConnectivity vs TemporalConnectivity besides our semantic definitions of each of these: EpochConn is time-axis in the first dimension (possibly discontiguous over time), while TemporalConn is time-axis in the last dimension that is contiguous over time.

However, I assume we want to differentiate these two. So how do we go about doing so?

Possible Solution

At the connectivity method level:

At the data container level:

WDYT? @britta-wstnr @larsoner

larsoner commented 3 years ago

So there is really no way to differentiate EpochConnectivity vs TemporalConnectivity besides our semantic definitions of each of these: EpochConn is time-axis in the first dimension (possibly discontiguous over time), while TemporalConn is time-axis in the last dimension that is contiguous over time.

I think as long as we make clear how these "time" axes differ, it can be clear why these two classes exist / why and how they can be differentiated.

Perhaps a useful instructive case is to consider how one would go from an Epochs object of shape (n_epochs, n_channels, n_times) to either of these objects. If your measure collapses across the epochs dimension to estimate time-varying connectivity at a sample-by-sample level, you get a TemporalConnectivity out. If your measure collapses across the sample-by-sample time dimension to get epoch-level-varying connectivity, you get EpochConnectivity. And if collapses across both (like our spectral measures typically do), you get Connectivity.

adam2392 commented 3 years ago

I agree that we can be more explicit, but I feel like the issue derives from MNE having different meanings for Epochs. For example: Epochs in MNE typically means discontiguous trials of EEG recording data. However, if you want to make a sliding window of your data, you can just use mne.make_fixed_length_epochs and now feed that data in.

The problem is both datasets are now as you said (n_epochs, n_signals, n_times). Okay, if I interpret each epoch as a window of EEG data and want to fit the VAR model here, then each Epoch -> 1 VAR model.

If instead Epochs are trials of duration n_times = like 30 seconds, then maybe I want to fit a sliding window over the time domain here for each epoch, resulting in say 10 VAR models per epoch now.

larsoner commented 3 years ago

but I feel like the issue derives from MNE having different meanings for Epochs. For example: Epochs in MNE typically means discontiguous trials of EEG recording data.

Hmm... I don't think of these being different meanings / I don't think of Epochs as this definition. Epochs to me are just contiguous (in time) chunks of your data taken from somewhere in an otherwise continuous (raw) recording. Whether the epochs are taken from N consecutive windows, consecutive windows with some missing (which should almost always be the case for sliding-window approaches if you use peak-to-peak and annotation-based artifact rejection properly!), or non-consecutive / event-based windows. If framed this way, I don't see any inconsistency.

In other words, I wouldn't ever allow a EpochConnectivity<->TemporalConnectivity conversion because I don't think it really makes sense. I think if you want epoch-by-epoch "time resolution", in practice it probably won't end up evenly spaced (e.g., in the case of artifact-based rejection) so there is no point in thinking of this as "TemporalConnectivity". Also thinking about the timescales, where EpochConnectivity will be on the order of seconds and TemporalConnectivity on the order of ms makes this clearer.*

*(You in principle could use a sliding window that moves by one sample at a time for EpochConnectivity and then end up with something with "millisecond resolution", but the window size is going to smooth things to the point that you don't really have millisecond resolution anyway, so this seems like an edge case...)

Maybe a good question is: why are we thinking about doing this conversion? What is the use case? If you want to know how your connectivity varies as a function of experimental (raw) timescales, I would use EpochConnectivity (probably with some epochs dropped) which collpases across the time dimesion, and plot this as a function of epoch. If you want to know how connectivity varies as a function of short (millisecond) timescales, then I would use TemporalConnectivity which collaspes across epochs.

britta-wstnr commented 3 years ago

I agree with Eric. In my mind, EpochsConnectivity has no time axis whatsoever. I can see how Epochs 1 to n could represent a time evolution, but they don't have to (for example, they could all be from different participants), so it is really just observations. If people do a "beginning of experiment vs end of experiment" kind of contrast, they usually sort these epochs into different conditions anyway, so they would end up with 2 or more EpochsConnectivity objects.

adam2392 commented 3 years ago

Okay based on our discussions then, it seems let epochs mean whatever the user wants it to mean.

If they make an epochs 3D array and pass to time varying VAR model, then it returns an epochconnectivity container. Then they interpret that accordingly whether it's over sliding windows or separate trials.

A question that comes to mind tho then is: what is the best way to add time points to the epoch connectivity tho relative to the original recording? E.g. if you want to say interpret connectivity at certain events like a seizure.

britta-wstnr commented 3 years ago

I am not sure what you mean with "adding time points"? Do you mean changing the relative 0 point / "trigger" in the epochs prior to computing connectivity? Because EpochsConnectivity has no time axis, right?

adam2392 commented 3 years ago

So for my use case, I would use make_fixed_length_epochs and the var.py model to compute a square A matrix for each Epoch. This emulates a sliding window computation with a VAR model fitted to each window (i.e. epoch) of data.

Ideally, I would like to preserve some notion of the fact that each window came from some "time points".

britta-wstnr commented 3 years ago

Would that be something that should be saved in some sort of attribute/info structure? Lag etc.?

larsoner commented 3 years ago

Maybe preserve the events array from the original Epochs somehow? That way you get both the sample number and the associated event number (useful especially if you have multiple conditions)

adam2392 commented 3 years ago

So taking a look at this now then, that means that if one uses make_fixed_length_epochs to construct Epochs from a Raw object with sfreq=1000 with a window size of 500 milliseconds and step size of 250 milliseconds, we would have an epoch.events array as:

[[    0     0     1]
 [  250     0     1]
 [  500     0     1]
 [  750     0     1]
...

indicating, epoch-1 starts at sample 0. Then using epochs.times, we can see that it is a 500 ms window. Then epoch-2 starts at sample 250 and again has a 500 ms window.

How Does This Apply to Our API?

Currently our API accepts both numpy arrays and sometimes(?) Epochs objects. If the passed in data is Epochs, okay we extract events and the times. If not tho... this requires allowing connectivity functions to take in an extra "events" array as an argument.

Should I just add an "optional" events kwarg to then all the connectivity functions that operate on Epochs structures? If one is not averaging over epochs, then this events can be used to assign "time points" to each epoch, which indicates "connectivity over time/windows".

Let me know if this doesn't make sense?

britta-wstnr commented 3 years ago

I feel like this should probably be done automatically if applicable (i.e. not asking the user to pass an extra argument and support it for Epochs). If a user needs this extra info but has only a numpy array at hand, they could make their numpy array into an Epochs object before by calling EpochsArray and create_info?

adam2392 commented 3 years ago

So all Connectivity containers should preserve events from Epochs if they are passed in. This way, the user can make use of events that occurred at the raw EEG level. This should be able to be done in 1 PR I think.

Is there anything special that EpochsTFR, or Epochs, or AverageTFR does with events besides just storing them? I skimmed the MNE-Python code and didn't see anything. @larsoner

larsoner commented 2 years ago

The events just get copied and stored AFAIK

adam2392 commented 2 years ago

Related: https://github.com/mne-tools/mne-python/issues/9850

? How do we preserve events that were stored in "sample" time from original epochs/raw data structure for connectivities that vary over epochs, not time ?