mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

FAQ: Regularizing the noise covariance matrix #1637

christianbrodbeck closed this issue 9 years ago

christianbrodbeck commented 9 years ago

re: https://github.com/mne-tools/mne-python/issues/1495#issuecomment-57097141 I have a first question (I will start a FAQ with it as @agramfort suggested): How should I regularize my noise covariance matrix?

The manual says "Since finite amount of data is usually available... " does that mean after having a certain amount of data regularization is no longer required?

What are recommended values for regularization? I noticed that the example does not use the function's default values. What do these values depend on?

If regularization is something that should always be done, should the function be more prominent? E.g., a Covariance method? Or even an argument of make_inverse_operator()?

christianbrodbeck commented 9 years ago

Related to this, is there a principled way to select the SNR value? People seem to use the default 3 for analyzing averages and I recall recommendations of 2 for single trials estimates and 1 for analyzing raw data, but would it be advantageous to estimate the SNR from the data?

dengemann commented 9 years ago

The manual says "Since finite amount of data is usually available... " does that mean after having a certain amount of data regularization is no longer required?

In my experience, after about 50 epochs a cov based on 200 ms baseline segments will not be very different from a regularized cov. Source amplitudes are still to some extent correlated with the number of epochs used for cov estimation; when you optimally regularize your cov this is not the case.

What are recommended values for regularization? I noticed that the example does not use the function's default values. What do these values depend on?

Difficult to say; these can depend on many things: sensor type, environmental noise at the site, vendor. In my experience there isn't really one regularization value that always works. I would recommend going with the defaults and looking at whitened butterfly plots as here: http://martinos.org/mne/stable/auto_examples/plot_evoked_whitening.html#example-plot-evoked-whitening-py

You want to see that most baseline signals are between -1.96 and 1.96, assuming a multivariate Gaussian. Second, you want to see that signals return to baseline after the evoked response. In addition you can compute a global field power using a chi-squared statistic across channels. Under the same assumptions you would expect the baseline signal to have a value around 1, not above, not below.
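To make those two checks concrete, here is a minimal numpy sketch (illustration only, not MNE code): for perfectly whitened Gaussian noise, roughly 95% of samples fall within ±1.96, and the per-timepoint sum of squares across channels divided by the rank has expected value 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_times = 60, 500

# Perfectly whitened noise: each channel is i.i.d. standard normal.
whitened = rng.standard_normal((n_channels, n_times))

# ~95% of samples should fall within +/-1.96 under the Gaussian assumption.
frac_in_bounds = np.mean(np.abs(whitened) < 1.96)

# The chi-squared GFP: sum of squares across channels divided by the rank
# (here the full channel count) has expected value 1 at every time point.
gfp = np.sum(whitened ** 2, axis=0) / n_channels
```

With real data you would run the same checks on the output of `mne.whiten_evoked` instead of simulated noise.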

Related to this, is there a principled way to select the SNR value?

Good question.

agramfort commented 9 years ago

we have an issue to compute the SNR of data like in mne_analyze, if you want to take a stab at it :)

dengemann commented 9 years ago

@agramfort how would you deal with the circularity of using an SNR estimate depending on the inverse solution to parametrize the SNR of the inverse solution?

agramfort commented 9 years ago

it is not circular. The noise is estimated from baseline.

christianbrodbeck commented 9 years ago

Thanks @dengemann ! So would you recommend to evaluate a covariance matrix, e.g. using the global field power, and then based on that regularize it or not?

@agramfort how do you compute the SNR? The manual says "The computation of the apparent SNR will be explained in future revisions of this manual" (http://martinos.org/mne/stable/manual/analyze.html#cacjffee)

dengemann commented 9 years ago

So would you recommend to evaluate a covariance matrix, e.g. using the global field power, and then based on that regularize it or not?

Yes. When using GFP don't forget to take into account the true degrees of freedom, e.g. `n - 3` if you have 2 SSP vectors active.
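To illustrate why that degrees-of-freedom correction matters, here is a hypothetical numpy sketch (not MNE code) that projects out `n_proj` directions the way SSP would; dividing by the raw channel count then biases the GFP low, while dividing by the true dof restores an expected value of 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_times, n_proj = 60, 1000, 2

# Whitened noise, then project out n_proj orthonormal directions (as SSP would).
data = rng.standard_normal((n_channels, n_times))
vecs = np.linalg.qr(rng.standard_normal((n_channels, n_proj)))[0]
data -= vecs @ (vecs.T @ data)

# The naive GFP underestimates; dividing by the true dof restores E[GFP] = 1.
gfp_naive = np.sum(data ** 2, axis=0) / n_channels
gfp_corrected = np.sum(data ** 2, axis=0) / (n_channels - n_proj)
```

(The `n - 3` in the comment above presumably counts one more constraint than the two SSP vectors; the sketch only shows the principle.)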

agramfort commented 9 years ago

@agramfort how do you compute the SNR? The manual says "The computation of the apparent SNR will be explained in future revisions of this manual" (http://martinos.org/mne/stable/manual/analyze.html#cacjffee)

:) I guess you have to read the C code :-/ or ask @mshamalainen to update the doc...

christianbrodbeck commented 9 years ago

@dengemann did I understand this correctly: because the whitened values should be normally distributed, the global field power is expected to be 1? Am I missing something? How/where do you use a chi square statistic?

agramfort commented 9 years ago

@dengemann did I understand this correctly: because the whitened values should be normally distributed, the global field power is expected to be 1? Am I missing something? How/where do you use a chi square statistic?

Because you sum the squares of normally distributed random variables, which follows a chi-squared distribution. But yes, the expected value should be 1.

christianbrodbeck commented 9 years ago

So for using the chi-squared statistic I would use the sum of squares over the sensors instead of the GFP, right?

christianbrodbeck commented 9 years ago

Also, does that mean I have a problem if this is my data? [edit: in my original figure the Evokeds were not baseline corrected; now they are. It still looks larger than 1, though, and regularization does not seem to change much]

For each subject I got an Evoked averaging about 150 trials, then calculated the GFP like:

    we = mne.whiten_evoked(evoked, cov, picks, True)
    gfp = we.data.std(0, ddof=1)  # (no active projections)
    plt.plot(we.times, gfp, color='b')

In the plot, blue lines are the GFPs with raw covariance matrix for each subject, red lines are GFPs with regularized covariance matrix.


agramfort commented 9 years ago

it should be 1 during baseline. It's almost 2 on your data. how did you compute the noise cov?

christianbrodbeck commented 9 years ago

Using the same trials as for the evoked, i.e. 150 or so per subject:

    epochs = mne.Epochs(raw, events, None, tmin=-0.1, tmax=0, baseline=(None, 0))
    cov = mne.compute_covariance(epochs)

and then for the red lines

    cov = mne.cov.regularize(cov, raw.info, mag=0.1, grad=0.1, eeg=0.1)

raw data are band-pass filtered 1-40 Hz, sampled at 1000 Hz

dengemann commented 9 years ago

@christianbrodbeck just to make sure, you're computing the GFP on the whitened evoked, right?

dengemann commented 9 years ago

Here's my GFP line of code:

    np.sum(evoked.data ** 2, axis=0) / rank

christianbrodbeck commented 9 years ago

@dengemann so you're looking at the variance? I thought the GFP is defined as the standard deviation. But the root of 1 should still be 1. I posted my code for whitening in the post above, is it not right?

agramfort commented 9 years ago

hum... can you share the -epo.fif file?

christianbrodbeck commented 9 years ago

Here's one of them: https://www.dropbox.com/s/gji1j3vaw5qybq6/e3_r26-epo.fif?dl=0

with GFP

    ep = mne.read_epochs('e3_r26-epo.fif')
    cov = mne.compute_covariance(ep, tmax=0)
    wep = mne.whiten_evoked(ep.average(), cov, range(len(cov.ch_names)), True)
    gfp = wep.data.std(0)


agramfort commented 9 years ago

your gfp is too high, which suggests you underestimate the noise variance. Can you try removing the baseline correction? I see that you high-passed at 1 Hz anyway.

christianbrodbeck commented 9 years ago

Can you try removing baseline

@agramfort How? The Epochs are baseline corrected, is that what you mean? Or do you mean try without baseline correction?

agramfort commented 9 years ago

yes, try without baseline correction. With a high-pass at 1 Hz an extra baseline correction should not be mandatory.

christianbrodbeck commented 9 years ago

hmm https://www.dropbox.com/s/r964a5y2bwv6ulf/e3_r26_nobl-epo.fif?dl=0 This is based on the same code as in the last post, with neither baseline correction for the covariance matrix nor for the GFP. It still looks too high. Is my data just noisy? Or would that not explain the misestimated variance?

agramfort commented 9 years ago

here we go

    import numpy as np
    import mne

    # ep = mne.read_epochs('e3_r26-epo.fif')
    ep = mne.read_epochs('e3_r26_nobl-epo.fif')
    cov = mne.compute_covariance(ep, tmax=0.)
    cov = mne.cov.regularize(cov, ep.info, mag=0.05)
    wep = mne.whiten_evoked(ep.average(), cov, range(len(cov.ch_names)),
                            diag=False)
    gfp = wep.data.std(0)

    print(np.mean(gfp[ep.times < 0]))  # should be close to 1

    import matplotlib.pyplot as plt
    plt.close('all')
    plt.plot(ep.times, gfp)
    plt.show()

the trick was to use diag=False + regularize a little bit.
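A small numpy sketch of what `diag` changes (illustration only, not MNE internals): a full whitener built from the Cholesky factor of the covariance removes cross-channel correlations, while a diagonal whitener only rescales each channel and leaves the correlations in place.

```python
import numpy as np

rng = np.random.default_rng(2)
n_channels, n_times = 20, 20000

# Correlated noise with a known covariance.
mix = rng.standard_normal((n_channels, n_channels))
true_cov = mix @ mix.T
data = np.linalg.cholesky(true_cov) @ rng.standard_normal((n_channels, n_times))
cov = data @ data.T / n_times

# Full whitener (diag=False analogue): inverse Cholesky factor of the cov.
full_whitener = np.linalg.inv(np.linalg.cholesky(cov))
white_full = full_whitener @ data

# Diagonal whitener (diag=True analogue): per-channel scaling only.
white_diag = data / np.sqrt(np.diag(cov))[:, None]

# The full whitener yields a (near-)identity covariance; the diagonal one
# leaves unit variances but keeps the cross-channel correlations.
cov_full = white_full @ white_full.T / n_times
cov_diag = white_diag @ white_diag.T / n_times
```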

christianbrodbeck commented 9 years ago

Ok, thanks! So it seems like the crucial change was whitening with diag=False. Is that the correct way to whiten for this test (as opposed to diag=True)? Or are both ways valid in different contexts?

christianbrodbeck commented 9 years ago

Also, if I now start writing a FAQ entry for this question (as discussed in https://github.com/mne-tools/mne-python/issues/1495) at what level should I add the FAQ page? As a subsection under MNE with Python?

agramfort commented 9 years ago

diag=True is what is used by inverse methods so you're good.

for the FAQ page, depending on format/content, I would even put it in the banner

christianbrodbeck commented 9 years ago

Is it preferable to keep the regularization parameter constant across subjects, or would it be preferable to bring the baseline to a value as close to 1 as possible for each subject separately? Based on the paper (@dengemann) I would suspect the latter? To find a good parameter, would you recommend e.g. doing the GFP test for each subject with regularizations in np.arange(0, 0.2, 0.01) and picking the one that brings the GFP test closest to 1?
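That sweep can be sketched in self-contained numpy (simulated data; plain diagonal loading stands in for `mne.cov.regularize`, and the helper name `baseline_gfp` is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n_channels, n_cov, n_base = 50, 150, 2000

# "True" noise covariance and a noisy estimate from relatively few samples.
mix = rng.standard_normal((n_channels, n_channels))
chol = np.linalg.cholesky(mix @ mix.T)
samples = chol @ rng.standard_normal((n_channels, n_cov))
cov_est = samples @ samples.T / n_cov

# Held-out baseline data drawn from the same noise distribution.
baseline = chol @ rng.standard_normal((n_channels, n_base))

def baseline_gfp(cov, data):
    """Mean whitened GFP; ~1 when the covariance matches the noise."""
    whitener = np.linalg.inv(np.linalg.cholesky(cov))
    white = whitener @ data
    return np.mean(np.sum(white ** 2, axis=0) / data.shape[0])

# Diagonal loading: shrink toward a scaled identity, sweep the parameter,
# and keep the value whose baseline GFP is closest to 1.
mu = np.trace(cov_est) / n_channels
scores = []
for reg in np.arange(0.0, 0.2, 0.01):
    cov_reg = (1 - reg) * cov_est + reg * mu * np.eye(n_channels)
    scores.append((abs(baseline_gfp(cov_reg, baseline) - 1.0), reg))

best_reg = min(scores)[1]
```

With few covariance samples the unregularized estimate over-whitens held-out data (GFP above 1), so a nonzero loading is typically selected.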

agramfort commented 9 years ago

the best option is the latter, but the really good option is to make our automatic covariance estimation code public.... @dengemann?

dengemann commented 9 years ago

Could someone put pressure on the reviewers please?


agramfort commented 9 years ago

maybe they watch the repo on github...

dengemann commented 9 years ago

I hope so.


christianbrodbeck commented 9 years ago

Hm, one more consideration: if my baseline has an evoked response to a previous stimulus, then I would not expect the whitened evoked GFP to be 1 but rather bigger than 1, right?

agramfort commented 9 years ago

Hm, one more consideration: if my baseline has an evoked response to a previous stimulus, then I would not expect the whitened evoked GFP to be 1 but rather bigger than 1, right?

it should still be one on average. If you have an evoked component in the baseline you're telling the method to treat it as noise and hence remove it from post-baseline data.

christianbrodbeck commented 9 years ago

it should still be one on average. If you have an evoked component in baseline you're tell the method to treat it as noise and hence remove it from post baseline data.

Even when calling compute_covariance with keep_sample_mean=False?

agramfort commented 9 years ago

indeed, keep_sample_mean=False aims to address this problem. Then you should check that the gfp is 1 after subtracting the evoked.
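A toy numpy illustration of the difference (hypothetical code, not MNE's exact bookkeeping): with a phase-locked component present in every epoch, the plain covariance absorbs it as "noise", while subtracting the mean over epochs first, which is what keep_sample_mean=False aims at, leaves only the true noise.

```python
import numpy as np

rng = np.random.default_rng(4)
n_epochs, n_channels, n_times = 100, 10, 50

# Noise plus a phase-locked (evoked) component present in every epoch.
noise = rng.standard_normal((n_epochs, n_channels, n_times))
evoked = np.outer(rng.standard_normal(n_channels),
                  np.sin(np.linspace(0, np.pi, n_times)))
epochs = noise + evoked

def epochs_cov(data):
    """Channel covariance pooled over epochs and time points."""
    x = data.transpose(1, 0, 2).reshape(n_channels, -1)
    return x @ x.T / x.shape[1]

# keep_sample_mean=True analogue: the evoked leaks into the "noise" cov.
cov_with_evoked = epochs_cov(epochs)

# keep_sample_mean=False analogue: subtract the mean over epochs first.
cov_noise_only = epochs_cov(epochs - epochs.mean(axis=0))
```

The trace of the first estimate exceeds the second by roughly the evoked power, which is exactly the inflation discussed above.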

christianbrodbeck commented 9 years ago

indeed keep_sample_mean=False aims to address this pb. Then you should check that gfp is 1 after subtracting the evoked.

Is there a whiten_epochs function somewhere for this? After subtracting the evoked the evoked is 0...

agramfort commented 9 years ago

good point. In this case you need to use left-out data to evaluate the gfp, for example by looking at new epochs built from the last time instants of your epochs, assuming the brain is back to baseline...

thanks for bringing up this relevant usecase

teonbrooks commented 9 years ago

hey, is there a recommendation for the number of epochs one should use to calculate the covariance matrix? I have an experiment with around 1000 trials, and in the past I used the baseline of all of the trials. I've tried to do this with the 'auto' parameter and the process did not complete (I waited an hour and it was still iterating through the Factor Analysis).

christianbrodbeck commented 9 years ago

[edit: haha sorry thought that was continuing a private conversation]


agramfort commented 9 years ago

If you have a lot of epochs FA will take too much time. Just use method=["shrunk", "empirical"]. Let me know how it goes.
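For intuition, the "shrunk" approach can be sketched in plain numpy as picking the shrinkage that maximizes the Gaussian log-likelihood of held-out data (simulated here; the real implementation cross-validates over epochs and handles rank, this is only a sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
n_channels, n_train, n_test = 30, 80, 200

# Correlated noise, split into a training and a held-out set.
mix = rng.standard_normal((n_channels, n_channels))
chol = np.linalg.cholesky(mix @ mix.T)
train = chol @ rng.standard_normal((n_channels, n_train))
test = chol @ rng.standard_normal((n_channels, n_test))

emp = train @ train.T / n_train
mu = np.trace(emp) / n_channels

def loglik(cov, data):
    """Mean Gaussian log-likelihood (up to a constant) of columns of data."""
    _, logdet = np.linalg.slogdet(cov)
    inv = np.linalg.inv(cov)
    quad = np.einsum('ij,ji->i', data.T @ inv, data).mean()
    return -0.5 * (logdet + quad)

# Shrink toward a scaled identity; keep the alpha that fits held-out data best.
best_alpha = max(np.arange(0.0, 0.5, 0.05),
                 key=lambda a: loglik((1 - a) * emp + a * mu * np.eye(n_channels),
                                      test))
```

With many channels relative to training samples, the empirical covariance overfits and a nonzero shrinkage wins on held-out likelihood.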

teonbrooks commented 9 years ago

I let it run overnight and, overwhelmingly, shrunk was chosen, with occasional minor victories for factor analysis.

larsoner commented 9 years ago

I think that we should change method='auto' to triage based on the number of epochs; if > 100 (or whatever @agramfort and @dengemann recommend) it should be equivalent to method=["shrunk", "empirical"] instead of testing the fuller set. auto isn't very useful if it locks up your computer for 12h to make insignificant modifications to the matrix.

agramfort commented 9 years ago

+1

dengemann commented 9 years ago

It depends on more things, especially the effective sample size, which is determined by the low-pass filter. Also you normally won't see FA winning over shrunk on single sensor data, especially after rank reduction.
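A rough back-of-envelope for that effective sample size (my own rule of thumb, not an MNE function): a signal low-passed at f_lp carries about 2·f_lp independent samples per second regardless of the sampling rate, so oversampled recordings have far fewer effective samples than raw samples.

```python
# Rough back-of-envelope: a signal low-passed at `lowpass` Hz carries about
# 2 * lowpass independent samples per second, regardless of the sampling rate.
def effective_samples(n_samples, sfreq, lowpass):
    """Approximate number of statistically independent samples."""
    return int(n_samples * min(1.0, 2.0 * lowpass / sfreq))

# 100 epochs x 100 ms of baseline at 1000 Hz, low-passed at 40 Hz:
n_eff = effective_samples(100 * 100, sfreq=1000.0, lowpass=40.0)  # 800
```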

dengemann commented 9 years ago

I would e.g. exclude fa by default if data are SSSed.

larsoner commented 9 years ago

I'm not sure what you mean by "single sensor data"...?

Triaging based on SSS is okay with me if that really helps, but I still think it would be nice if we could triage based on the number of epochs, plus whatever other factors matter. It doesn't have to be perfect, but we should definitely prevent freezing someone's computer for so long (I had a similar disappointing experience). @dengemann would you be satisfied if we used a rule of thumb based on number of epochs scaled by the relative width of the pass-band? Do you have a sense if this would be the right scaling?

dengemann commented 9 years ago

@Eric89GXL by single sensor I mean single sensor type, e.g. mag or grad or eeg only. FA really wins when signals are more complex.

larsoner commented 9 years ago

Is that true even when the number of epochs is high?

dengemann commented 9 years ago

Is that true even when the number of epochs is high?

FA tends to benefit from having more samples available. It will not converge if data are SSSed or heavily projected, so the procedure will climb up all possible ranks, adding more computation time at each iteration.