Cluster-based stats improvements - potential GSoC (and future)

mmagnuski commented 6 years ago

Mirroring the FieldTrip API

Eric @larsoner suggested to mirror fieldtrip's cluster-based API in mne python:

Mikolaj, perhaps instead of making a new API proposal, you could try Python-izing the FieldTrip API. No need to reinvent the wheel here if theirs is complete / extensible enough already, and it would have the added benefit of making user code porting simpler.

That makes sense, although I would prefer to change a few things, notably the design, ivar and uvar which are quite confusing for newcomers. I can start with showing how an almost exact copy of fieldtrip's API would look (I omit method='montecarlo', because we are considering now mirroring only the cluster-based API - or maybe not, as @jona-sassenhagen suggests):

# within-subjects t-test
from mne.stats import ttest_rel_no_p, cluster_based_statistic

conditions = ['condA', 'condB']

erp_list = list()
for epochs in epochs_list:
    erp_list += [epochs[cond].average() for cond in conditions]

# in fieldtrip, because all the files/trials are passed in a cell array (or
# just varargin) you need additional info sepcifying the design:
conditions = np.array([0, 1] * n_subjects)
subjects = np.array([[i, i] for i in range(n_subjects)]).ravel()
design = np.stack([subjects, conditions], axis=0)

# later in the function `uvar` denotes the row containing 'units of observation',
# (here these are single subject identifiers), while `ivar` denotes levels of independent
# variable. Additonally we have adjacency structure in `adjacency` var

cluster_based_statistic(erp_list, statistic=ttest_rel_no_p, alpha=0.025,
                        cluster_alpha=0.05, cluster_statistic='maxsum',
                        min_nb_chan=2, neighbours=adjacency, tail=0,
                        cluster_tail=0, n_permutations=1000, design=design,
                        uvar=0, ivar=1)
# the alpha is set to 0.025 above to correct for two-tail, fieldtrip also
# allows to set another parameter to have two-tail cluster p values returned

I like the Fieldtrip’s cluster-based API in general but not the design, ivar, uvar approach. This is something that would require some further thought. Passing a real design matrix might be ok to add some time later for more advanced users, but the basic API should be much simpler. A sketchy idea I have:

erp_list = {cond: [epochs[cond].average() for epochs in epochs_list]
            for cond in cond_list}

cluster_based_statistic(erp_list, statistic=ttest_rel_no_p,
                        design='condA - condB', neighbours=adjacency)

and then in ANOVA cases, rm_anova API could be mirrored with factor_levels arg (or levels to also accommodate linear regression) and effects could be passed to design (which could be called effects instead of design BTW).

Additionally times or freqs (when dealing with TFR or spectra) could be passed to crop the search space (this is be similar to fieldtrip’s API). The next step would be to return some kind of object that holds the results together and allows for selecting clusters and plotting/inspecting results. But that’s another story (I had sketched an API for such object some time ago but couldn’t dig it up now).

So, overall, the fieldtrip API is not so different from what we have in mne, but following things are missing:

support for mne objects and simple interface for at least dependent and independent t-tests
(then adding anova and regression but that could all be followups)
support for 3d data (channel x frequency x time)
min_nb_chan filtering (other forms would be nice too, but as far as I remember they are not supported in fieldtrip)
separating tail and cluster_tail (minor detail, not that useful in most cases)
step_down is nice in mne and I think it should be included in top-level cluster API
some rudimentary viz for cluster-based

Other

I also attach the more general points I sketched in the first mail (those not covered above):

expose a public clustering function (one that works for 1d, 2d and 3d) - so that users can build their more exotic cluster-based approaches more easily (currently all clustering functions are private IIRC)
return an object that holds standard cluster-based results with simple easy API (currelntly 4 variables are returned) along with save and viz functionality (ideally, long-term)
ability to read and write fieldtrip format cluster-based results (conventionally stat in ft)
new cluster-based functions: cluster-based regression; using permutation tests within cluster-based (requires pre-computing the permutation distribution for the ‘smaller’ permutation tests that are used in cluster-based)

Other points gathered from the discussion below:

allow for data to be passed as a dictionary:

erp_list = {cond: [epochs[cond].average() for epochs in epochs_list] for cond in cond_list}

define a relatively high level intermediate API that works with only arrays and some sort of high level dimension specification
use Patsy for the design matrices.
consider using HED tags
prefer an API that is agnostic to what method you use for correction for multiple tests:
```
mass_univariate(erp_list, method=('fdr' | 'cluster' | ...), ...)
```
support regression from the start
show how to do baseline contrasts (#608)
allow threshold='tfce'? (#5534)

jona-sassenhagen commented 6 years ago

If we have an object with cluster-based results, I will write methods functions for it :)

jona-sassenhagen commented 6 years ago

I'm late here, but:

erp_list = {cond: [epochs[cond].average() for epochs in epochs_list] for cond in cond_list}

This is also the API used by plot_compare_evokeds, so it would be a nice consistency.

larsoner commented 6 years ago

@christianbrodbeck any feedback on the API proposal based on what you have in eelbrain?

jona-sassenhagen commented 6 years ago

Some further thoughts:

I think it would be pythonic to use Patsy for the design matrices.
We should also consider using HED tags
I am not sure how compatible the aforementioned points are
I would prefer an API that is agnostic to what method you use for correction for multiple tests, i.e., I'd prefer mass_univariate(erp_list, method=('fdr' | 'cluster' | ...), ...
if it's not too much work, support regression from the start
would be super neat if the output is an object with viz and maybe even save methods

christianbrodbeck commented 6 years ago

Not sure, I haven't really been following this after implementing stats in eelbrain. I would not want to stop using data tables with meaningful labels and data objects with arbitrary dimensions. But maybe it would be useful to define a relatively high level intermediate API that works with only arrays and some sort of high level dimension specification?

mmagnuski commented 6 years ago

I agree with @jona-sassenhagen on all points but consider them a characterisation of the end goal, we will need a lot of small steps to get there. API-wise having mass_univariate instead of cluster-specific sounds good to me, but the returned objects would be different for cluster and non-cluster stats. The patsy + plot_compare dict style is something I considered - this might work best for t tests and anova, and generally setups where we don't have many predictors, consider:

data: dictionary of condition -> list of mne data objects mappings (with keys like l2/single) levels: {'memory_load': ['l2', 'l3', 'l4'], 'task': ['single', 'dual']} design: 'memory_load + task + memory_load * task'

however once someone is interested in modelling effects of time on task for example, this API would be problematic (but could work well with meta dataframes).

jona-sassenhagen commented 6 years ago

however once someone is interested in modelling effects of time on task for example, this API would be problematic (but could work well with meta dataframes).

Or regexp: rt* -> rt10, rt11, rt12

agramfort commented 6 years ago

one option would be to start a separate repo to allow easier try and errors and then integrate to mne core when API is more settled.

mmagnuski commented 6 years ago

@agramfort - Sure, a separate repo for early steps sounds reasonable.

@jona-sassenhagen

Or regexp: rt* -> rt10, rt11, rt12

I'm not sure I understand - you mean to always use the epochs metadata then? We could either have use_metadata kwarg or automatically check if passed epochs have metadata with columns corresponding to levels.

mmagnuski commented 6 years ago

I updated the first post a little and will be progressively organizing it as our ideas shape up. Currently the early API direction seems to be to:

allow for dicts of data
use levels kwarg for specifying levels of factors/predictors
use epochs metadata (in within-subject analysis) when mentioned in levels (maybe with use_metadata safety kwarg)
allow for patsy-like formulas

For within-subject analyses we would have to adopt the 2-step approach (unless someone wants to mess with mixed models): estimating parameters for individual subjects and then testing whether these args are different from zero for example.

larsoner commented 6 years ago

One thing that would help us here that we don't have is a tutorials/plot_background_statistical_contrasts.py that shows how to do commonly needed contrasts in Python starting with data in lists of arrays, probably with scipy.stats, patsy, and statsmodels:

t-test (1 sample, 2 sample)
1-way ANOVA (repeated measures, independent)
2-way ANOVA (test for main effects, interaction term x repeated measures / mixed / independent)
Others?

Ideally it would become clear from this tutorial how / why the dict approach can be used to simplify things. Is there some volunteer for this?

If we can settle on clear best practices for such a tutorial, then the clustering API probably does not have to add too much complexity on top of it. The "only" added complexity of clustering has to do with how to properly define adjacency among multiple dimensions, I think. So the mass_univariate API does seem quite appealing.

jona-sassenhagen commented 6 years ago

Im quite open to working on such a tutorial. It’s perhaps the biggest hurdle for people doing basic ERP work who want to use MNE rght now. At the last MNE tutorials I gave, this was the least smooth step.

mmagnuski commented 6 years ago

@larsoner - might that be better suited to add to the biomag mne demos? Or maybe we can use the files from the faces dataset here (in main mne repo)?

larsoner commented 6 years ago

No, I don't want to start with multidimensional Neuroimaging data. Let's make fake data about subject reaction time or something else that is truly univariate. If we want it to be like MEG data let's pretend that the data are peak amplitude or latency or something. But we should use synthetic data for this.

larsoner commented 6 years ago

@jona-sassenhagen if you have time to work on this please do. There will be overlap with plot_background_statistics.py but we can refactor later.

larsoner commented 5 years ago

Just a minor update, I have heard from folks recently that statsmodels has simplified some of the processes of getting your data into the right format for within/between contrasts.

@christianbrodbeck any idea if your testnd functionality could be ported here to take care of some of these problems? It looks like you have made some efforts to generalize our stats stuff.

christianbrodbeck commented 5 years ago

@larsoner I would be all for it, but the history is completely separate from the mne stats functions (and so the implementation structure is probably also quite different). As far as the front-end is concerned, @agramfort was against including the model specification API I am using in mne, and for the backend I'm using Cython quite a lot...

larsoner commented 5 years ago

@christianbrodbeck can you look at the top post to see how much overlap there would be between what @mmagnuski has summarized as our so-far consensus, and what you already have?

What do you use Cython for? Just the clustering step? Does it end up being much faster?

mne-tools / mne-python

Cluster-based stats improvements - potential GSoC (and future) #4859

Mirroring the FieldTrip API

Other