mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.66k stars 1.31k forks source link

Add steady-state response statistical tests #9045

Open rob-luke opened 3 years ago

rob-luke commented 3 years ago

Describe the new feature or enhancement

@jundurraga and I would like to add two different statistical tests which are used to determine the presence of steady-state evoked potentials. An F-test (Dobie 1996) and a two-sample Hotelling T2 (Hofmann 2012).

Recently in #8867 an example of analysis of visual steady state responses was added. But this is a largely qualitative analysis, we wish to add quantitative statistical measures which are widely used in the analysis of auditory steady state responses (ASSRs). These responses are useful for objectively evaluating auditory thresholds, assessing suprathreshold hearing, and monitoring the state of arousal during anesthesia (Picton 2003).

Describe your proposed implementation

In ASSRs, sounds are presented which are modulated at a known rate (e.g. 40 Hz). When analysing the M/EEG signal we wish to determine if the measured signal contains a consistent component at the frequency of interest, or if the signal at the frequency of interest is random. This is typically done using an F-test which looks at the power in the FFT bin of interest (signal) relative to a sample of neighbouring FFT bins (noise). So in the example above, you compare the power in the FFT bin corresponding to 40 Hz with, for example, the 30 bins above and below this bin (this may correspond to +- 1.5Hz, depends on the length of epochs and sample rate). Whereas in the HT2 test the noise estimate is based on the variation within the single FFT bin of interest across epochs (rather than the adjacent bins) and looks at both the consistency of both the real and imaginary component of the FFT. These tests have been widely validated and in our tutorial we can demonstrate that they give appropriate false positive rates.

Does this seem appropriate for adding to MNE? We will also create a new tutorial as the new tagging example is more about comparing two different conditions, whereas ASSRs are usually measured in continuous ~5 min blocks. We can easily contribute a small example data file too.

Describe possible alternatives

Jaime already has open source python code for these tests. I cant link to them now because gitlab is down. But we can do when its back up.

API: Function names

I was thinking to put the code in mne/stats/steadystateresponses.py

I have looked at the parametric stats section and the names f_... are already used. So to differentiate from these existing stats methods I was thinking something like

Or would it be better to be more explicit with something like steadystateresponse_f_test().

API: Function returns

This function will return lots of information including signal amplitude, signal phase, noise amplitude, noise phase, test statistic, p value, etc.

In my personal code I return a data frame. As its common to analyse many different frequencies you can easily then append the output of many functions. However, pandas is an optional dependency for MNE. Are there other statistical functions in MNE that return quite a bit of information and how do they do it? A dict? Then its easy to convert to a dataframe and loop.

Additional comments

Jaime have I missed anything?

If this work seems appropriate for MNE then we can open a pull request. The code here should be relatively easy as we both have several implementations of these tests already, I think the tricky part will be deciding a nice API that fits with existing MNE infrastructure.

rob-luke commented 3 years ago

@larsoner @drammock @agramfort @dominikwelke @hoechenberger

larsoner commented 3 years ago

Does this seem appropriate for adding to MNE? We will also create a new tutorial as the new tagging example is more about comparing two different conditions, whereas ASSRs are usually measured in continuous ~5 min blocks. We can easily contribute a small example data file too.

To start how about (can be in one PR):

  1. Implement the Hotelling T2 in mne/stats/parametric.py
  2. Add to the SSVEP example the code to do the stats using ANOVA and Hotelling

This would help us understand if functions like ssr_ht2_test end up being necessary. If it's only a couple of lines of NumPy to get the correct entries from the rfft output and input them to mne.stats.hotelling_t2, and it sounds like there are a lot of options (e.g., how many bins to use, etc.) that people probably want explicit control over, it might make more sense to have people do that on their own. Or maybe it would make sense to take care of #7671 first, and then having the Spectrum class + stat functions it will be clear how to make the analysis easy for people.

drammock commented 3 years ago

Or maybe it would make sense to take care of #7671 first, and then having the Spectrum class + stat functions it will be clear how to make the analysis easy for people

I'd guess that #7671 won't happen for several months at least. I'd be OK with adding this functionality now and refactoring later once the spectum class exists.

jundurraga commented 3 years ago

@rob-luke that sounds fine rob, perhaps worth adding phase-locking value (PLV) too. @larsoner, @drammock I have implemented these functionalities using a discrete FFT algorithm (Goertzel). Doing so speed up the calculations and memory usage to allow computations at the source level, for example. For the ht2 and phase-locking values, one needs to pass the frequencies of interest and alpha value. For the F-test, it is also necessary to pass the frequency range of the neighbours' frequencies. Perhaps I would not use a spectral class for this, but rather a more general "statistical class" in which you can keep the test value (e.g. F-Value, HT2 value, PLV) as well as their correspondent degrees of freedoms, and p-values, and p-critical values. The attached figure shows an example of significant phase-locking (0 to 1) values estimated at the source level for an ASSR response. fsaverage_dSPM_B1_data_fsaverage_2_48_dSPM_auto_events_False_45_plv_rh

rob-luke commented 3 years ago

Add to the SSVEP example the code to do the stats using ANOVA and Hotelling

Hi @larsoner, Jaime and I just spoke on the phone and we are unable to understand what you mean by this. Could you please elaborate?

larsoner commented 3 years ago

I'm just suggesting that we implement the parametric tests f_test and hotelling_t2_test in mne/stats/parametric.py and show how to use them directly in some example, presumably the SSVEP one. We can show ttest_rel vs f_test vs hotelling_t2_test for example.

Then if it's clear that we can make a useful API for people from that code we can add the steady-state-specific ones like ssr_ht2_test after that.

rob-luke commented 3 years ago

Ah I get you now, that makes sense.

dominikwelke commented 3 years ago

hi @rob-luke @jundurraga , sorry somehow i have missed this!

i agree that having these tests easily at hand in MNE might be practical and facilitate work with steady state data. @kalenkovich and i also planned putting together some dedicated code for frequency-tagging / steady-state analyses, so I'll definitely follow this from now on :)

API: Function names

I was thinking to put the code in mne/stats/steadystateresponses.py

I have looked at the parametric stats section and the names f_... are already used. So to differentiate from these existing stats methods I was thinking something like

ssr_f_test: Steady-state response f-test. ssr_ht2_test: Steady-state response two-sample hotelling T2 test.

Or would it be better to be more explicit with something like steadystateresponse_f_test().

just a comment on the proposed API so far:

as you have already realized when reviewing our tutorial there is a broad range of names and keywords for different research that applies more or less identical analyses pipelines (assr, ssvep, vssp, fvps, freqeuncy-tagging, some people talk about entrainment and so on)

steady state [responses] is a term that's often used regardless of the sensory modality (audio/visual) but it's loaded with quite some assumptions about the investigated neural processes.
however, that the measured signal component (power at a given frequency) is steady does not neccessarily imply that the neural response is "steady" - whatever this would mean. in the visual domain whats called steady state can also be conceptualized as a series of evoked responses - if the stim frequency is fix, you will see VEPs in the PSD. ..and also in the auditory domain one can investigate processes that are clearly no steady-state responses, e.g. here https://doi.org/10.1038/nn.4186

f-test and hotelling t^2 might be more common in the assr literature than in e.g. vssr, but if this commit is the initialization of a broader MNE API for these types of analyses we might want to be more general purpose:

using frequency-tagging in the API would be one option (e.g. ftag_ht2 or freqtag_ht2). it reflects more of the conceptual/methodological side of whats happening and should hence concisely cover more use cases.

just a suggestion, open for discussion :) (btw, this notion originates from Bruno Rossion but i find it convincing)

rob-luke commented 3 years ago

Thanks for the input @dominikwelke.

the neural response is "steady" - whatever this would mean.

It means the response is continuous and does not vary over time (see 1). As such, you can use methods with these assumptions to estimate the neural response, such as averaging and FFTs. As you say, the term is widely used across both auditory and visual fields. There are many publications that use this term and books on the topic (see 2,3). Whereas your suggestion of frequency tagging is newer and less widely used (I checked that on google scholar).

it reflects more of the conceptual/methodological side of whats happening and should hence concisely cover more use cases.

I disagree with this. Tagging somewhat vaguely describes a method (can you please point me to a concise definition of the term?). Whereas the traditional term describes the neural response, which is what we actually want to analyse and quantify (irrespective of the methodology used to evoke it). I guess the terms are complimentary, you use a frequency tagging methodology to analyse a steady-state neural response. However, could you have a steady state response without the use of a frequency tagging methodology? I guess you could, and then you would still wish to analyse it with these methods.

Obviously there are arguments for and against each terminology. However, I don’t think that this is the place to try and change the accepted terminology that is widely used in both visual and auditory fields. So we should stick with steady state as that’s most common and widely accepted in the field.

Van Eeckhoutte, Maaike, Robert Luke, Jan Wouters, and Tom Francart. "Stability of auditory steady state responses over time." Ear and Hearing 39, no. 2 (2018): 260-268.

Rance, G (2008). The auditory steady state response. Plural Publishing

Picton, T. W. (2010). Human auditory evoked potentials. Plural Publishing.

dominikwelke commented 3 years ago

as i said, just wanted to throw an alternative into the discussion :)

no intention to rant about any of the used terminology, they were all coined with good reasons.

It means the response is continuous and does not vary over time (see 1).

yes, we need to assume the response steady or stationary such that we can do an fft over 30 s or more and interpret the result, but this does not necessarily mean that what the brain does is something steady (this is what i meant with the quotation marks).. it dynamically tracks a dynamic stimulus stream, and given a specific structure/design of the stimulusstream the response can be analysed with the tools we discuss

steady state is most broadly used and everybody will know what it means and what to do with it. just as i said in my (and some other's) reading, the term "frequency tagging" would be a common denominator that is more general. i am quite new in this field, and i do come from a more applied background though..

rob-luke commented 3 years ago

Is there a journal article primer on frequency tagging that you can recommend that discusses all this?

But as you say, SSR is “most broadly used and everybody will know what it means and what to do with it”. We will put in the docs all the alternative names, but to allow the greatest number of users to easily find and understand the code then the most widely known terminology should be used.

As a use case: I’ve been doing these analyses for many years across several labs and I wouldn’t have understood what freqtag is.

rob-luke commented 3 years ago

(I'm on a road trip, so too much time to think about this 😃)

So I think my problem with frequency tagging is the word "tagging" and the explicit implication of a stimulus. Whereas, I want to be able to study any neural signal that is constant in frequency, not necessarily locked to a tagging stimulus. For example, I have tinnitus that is a constant tone, I would like to examine if that is reflected in the eeg signal, here the measured signal may follow a perceived sound but there is no tagging occurring. Further, we often analyse resting state data with these analysis techniques, which makes even less sense if the word tagging is used. Hence why I vote for terminology that is about the neural signal and not about the stimulus/methodology.

Another terminology that is used (less so than steady state response, but more so than frequency tagging) is frequency following response. If I was making up my own terms I might use fixed frequency signal, but then absolutely no one would find our code 😉.