mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License
2.67k stars 1.31k forks source link

BUG: threshold free cluster enhancement when 1 values has std=0. #2158

Closed kingjr closed 6 years ago

kingjr commented 9 years ago

I am using spatio_temporal_cluster_1samp_test with a dict threshold so as to perform a threshold free cluster enhancement (TFCE) on GeneralizationAcrossTime (GAT) matrices.

I want to test whether some lines of the GAT matrix significantly differ from the diagonal of the matrix across subjects.

This difference (X - np.tile(np.diag(X), [X.shape[0], 1])) automatically makes the value on the diagonal equal to 0 for all subjects.

In consequence, the diagonal variance across subjects is equal to 0, the default stat_func (ttest_1samp_no_p ) generates nan values, the stop threshold parameter becomes nan, no threshold are tested, and the resulting p_values all equal 1.

I see three options

Here is a full example:

import numpy as np
from mne.stats import spatio_temporal_cluster_1samp_test as STC1samp

n_observations = 10
n_times = 10
X = np.random.randn(n_observations, n_times, n_times)
# add information
X[:, n_times / 2:, :] += 1.

threshold = dict(start=1., step=1.)

# classic comparison
_, _, p_values, _ = STC1samp(X, threshold=threshold, n_jobs=-1, verbose=False)
print(np.min(p_values))

# compare diagonal with off diagonal elements:
Y = X - np.tile([np.diag(x) for x in X], [n_times, 1, 1]).transpose(1, 0, 2)
_, _, p_values, _ = STC1samp(Y, threshold=threshold, n_jobs=-1, verbose=False)
print(np.min(p_values))

def stat_fun(x, sigma=0, method='relative'):
    from mne.stats import ttest_1samp_no_p
    t_values = ttest_1samp_no_p(x, sigma=sigma, method=method)
    t_values[np.isnan(t_values)] = 0
    return t_values

_, _, p_values, _ = STC1samp(Y, threshold=threshold, n_jobs=-1,
                             stat_fun=stat_fun, verbose=False)
print(np.min(p_values))
dengemann commented 9 years ago

Excluding the diagnoal values seems the easies to me. Least effort to implement. And why do you wanr to test features where testing does not make any sense (the diagonal in that case)?

2015-05-28 14:14 GMT+02:00 J-R King notifications@github.com:

I am using spatio_temporal_cluster_1samp_test with a dict threshold so as to perform a threshold free cluster enhancement (TFCE) on GeneralizationAcrossTime (GAT) matrices.

I want to test whether some lines of the GAT matrix significantly differ from the diagonal of the matrix across subjects.

This difference (X - np.tile(np.diag(X), [X.shape[0], 1])) automatically makes the value on the diagonal equal to 0 for all subjects.

In consequence, the diagonal variance across subjects is equal to 0, the default stat_func (ttest_1samp_no_p ) generates nan values, the stop threshold parameter becomes nan, no threshold are tested, and the resulting p_values all equal 1.

I see three options

  • leave it like that, and convince me to exclude the diagonal values (-1)
  • in mne.stats._find_clusters define the stop key of threshold with np.nanmax, istead of np.max (+1?)
  • have a robust ttest_1samp_no_p which output 0 if t_values is nan(+1?)

Here is a full example:

import numpy as np from mne.stats import spatio_temporal_cluster_1samp_test as STC1samp

n_observations = 10 n_times = 10 X = np.random.randn(n_observations, n_times, n_times)

add information

X[:, n_times / 2:, :] += 1.

threshold = dict(start=1., step=1.)

classic comparison

, , pvalues, = STC1samp(X, threshold=threshold, n_jobs=-1, verbose=False) print(np.min(p_values))

compare diagonal with off diagonal elements:

Y = X - np.tile([np.diag(x) for x in X], [ntimes, 1, 1]).transpose(1, 0, 2) , _, pvalues, = STC1samp(Y, threshold=threshold, n_jobs=-1, verbose=False) print(np.min(p_values))

def stat_fun(x, sigma=0, method='relative'): from mne.stats import ttest_1samp_no_p t_values = ttest_1samp_no_p(x, sigma=sigma, method=method) t_values[np.isnan(t_values)] = 0 return t_values

, , pvalues, = STC1samp(Y, threshold=threshold, n_jobs=-1, stat_fun=stat_fun, verbose=False) print(np.min(p_values))

— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2158.

dengemann commented 9 years ago

Althouth nanmean and nanstd could be options for the ttest_1samp ... But then you need to backport those, they are not available in old numpy versions.

2015-05-28 14:22 GMT+02:00 Denis-Alexander Engemann < denis.engemann@gmail.com>:

Excluding the diagnoal values seems the easies to me. Least effort to implement. And why do you wanr to test features where testing does not make any sense (the diagonal in that case)?

2015-05-28 14:14 GMT+02:00 J-R King notifications@github.com:

I am using spatio_temporal_cluster_1samp_test with a dict threshold so as to perform a threshold free cluster enhancement (TFCE) on GeneralizationAcrossTime (GAT) matrices.

I want to test whether some lines of the GAT matrix significantly differ from the diagonal of the matrix across subjects.

This difference (X - np.tile(np.diag(X), [X.shape[0], 1])) automatically makes the value on the diagonal equal to 0 for all subjects.

In consequence, the diagonal variance across subjects is equal to 0, the default stat_func (ttest_1samp_no_p ) generates nan values, the stop threshold parameter becomes nan, no threshold are tested, and the resulting p_values all equal 1.

I see three options

  • leave it like that, and convince me to exclude the diagonal values (-1)
  • in mne.stats._find_clusters define the stop key of threshold with np.nanmax, istead of np.max (+1?)
  • have a robust ttest_1samp_no_p which output 0 if t_values is nan (+1?)

Here is a full example:

import numpy as np from mne.stats import spatio_temporal_cluster_1samp_test as STC1samp

n_observations = 10 n_times = 10 X = np.random.randn(n_observations, n_times, n_times)

add information

X[:, n_times / 2:, :] += 1.

threshold = dict(start=1., step=1.)

classic comparison

, , pvalues, = STC1samp(X, threshold=threshold, n_jobs=-1, verbose=False) print(np.min(p_values))

compare diagonal with off diagonal elements:

Y = X - np.tile([np.diag(x) for x in X], [ntimes, 1, 1]).transpose(1, 0, 2) , _, pvalues, = STC1samp(Y, threshold=threshold, n_jobs=-1, verbose=False) print(np.min(p_values))

def stat_fun(x, sigma=0, method='relative'): from mne.stats import ttest_1samp_no_p t_values = ttest_1samp_no_p(x, sigma=sigma, method=method) t_values[np.isnan(t_values)] = 0 return t_values

, , pvalues, = STC1samp(Y, threshold=threshold, n_jobs=-1, stat_fun=stat_fun, verbose=False) print(np.min(p_values))

— Reply to this email directly or view it on GitHub https://github.com/mne-tools/mne-python/issues/2158.

kingjr commented 9 years ago

And why do you wanr to test features where testing does not make any sense (the diagonal in that case)?

The idea is that you want to know whether the classifier is stable across time: figure 3 black lines versus purple lines: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0085791

kingjr commented 9 years ago

Althouth nanmean and nanstd could be options for the ttest_1samp ... But then you need to backport those, they are not available in old numpy versions.

if the np.isnan exists, that's not too hard to do

larsoner commented 9 years ago

I don't know anything about this application, but I want to point out that we already have a "robust" ttest_1samp, use ttest_1samp_no_p(..., sigma=1e-3) and you should be good. It constrains the minimum variance to be at least some factor smaller than the maximum variance when doing a bunch of comparisons. But it only makes sense to use if you expect your variances to be on the same order of magnitude.

larsoner commented 6 years ago

Closing because the "hat" method mentioned above is designed for this case