Open JoseAlanis opened 5 years ago
cc @agramfort @larsoner
hi @JoseAlanis, thanks for all the hard work during your GSoC. It is this kind of work that pushes the frontiers of open and reproducible science forward. This is an impressive amount of work, and I have only one small piece of feedback.
It would be great if you could find some free time to improve the documentation (both on the MNE side and in the sandbox repository) now that you are familiar with the tools. With regard to the examples, even before diving into any stats or regression, it would be nice to show what the data is all about, because many of us don't know what the LIMO dataset contains and what metadata is in there. Wrapping some things in convenience functions and exposing an API to make the examples shorter, etc., would be a priority for me.
> However, we believe that using a machine learning package for linear regression might irritate users in the long run.
I don't think this is a real problem. Installing Python packages nowadays is a lot easier than it used to be, so relying on sklearn seems fine.
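For context, the mass-univariate case is already a one-liner with scikit-learn, since `LinearRegression` supports multi-output targets. A minimal sketch with simulated data (the shapes and variable names here are illustrative assumptions, not the sandbox API):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Hypothetical trial-wise EEG data: 50 trials, 32 sensors, 100 time points,
# plus one continuous predictor per trial.
n_trials, n_sensors, n_times = 50, 32, 100
data = rng.normal(size=(n_trials, n_sensors, n_times))
predictor = rng.normal(size=(n_trials, 1))

# LinearRegression handles multi-output targets, so one fit covers every
# sensor/time point at once after flattening the spatiotemporal dimensions.
model = LinearRegression().fit(predictor, data.reshape(n_trials, -1))
betas = model.coef_.reshape(n_sensors, n_times)  # slope at each sensor/time
print(betas.shape)  # (32, 100)
```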
> The second major issue concerns the inference part.
There are private functions that do the clustering step. These should already be separate from the ones that choose and iterate over permutations, etc., but if they aren't, we can separate them better. Then we could have `*_bootstrap_*` public functions in place of the `*_permutation_*` ones.
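For what it's worth, a rough sketch of such a split (hypothetical names and a deliberately simple 1-D clustering helper, not MNE's actual internals): one private function that only forms clusters from a statistic map, reused by a bootstrap driver in place of the permutation loop.

```python
import numpy as np

def _find_clusters(stat_map, threshold):
    """Return (start, stop, mass) for runs of supra-threshold samples (1-D)."""
    above = np.abs(stat_map) > threshold
    # Edges of the boolean mask mark cluster starts (even) and stops (odd).
    edges = np.flatnonzero(np.diff(np.concatenate(([0], above.view(np.int8), [0]))))
    starts, stops = edges[::2], edges[1::2]
    return [(s, e, np.abs(stat_map[s:e]).sum()) for s, e in zip(starts, stops)]

def bootstrap_cluster_test(data, threshold, n_boot=200, seed=0):
    """Bootstrap driver reusing the clustering helper instead of permutations."""
    rng = np.random.default_rng(seed)

    def t_map(x):  # one-sample t-statistic along the subject axis
        return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))

    clusters = _find_clusters(t_map(data), threshold)

    # Center the data so H0 holds, resample subjects with replacement,
    # and build a null distribution of the maximum cluster mass.
    centered = data - data.mean(0)
    null_max = np.empty(n_boot)
    for b in range(n_boot):
        sample = centered[rng.integers(0, len(data), len(data))]
        masses = [m for *_, m in _find_clusters(t_map(sample), threshold)]
        null_max[b] = max(masses) if masses else 0.0
    p_vals = [float((null_max >= m).mean()) for *_, m in clusters]
    return clusters, p_vals

# Toy example: 12 "subjects", 200 time points, an effect between samples 50-80.
data = np.random.default_rng(3).normal(size=(12, 200))
data[:, 50:80] += 1.0
clusters, p_vals = bootstrap_cluster_test(data, threshold=2.0)
```

The design point is simply that `_find_clusters` knows nothing about resampling, so permutation and bootstrap drivers can share it.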
Hey guys, thanks a lot for your feedback. I opened a PR for improving the documentation on the LIMO dataset, and also added a proposal for what a subject-level regression function could look like. We could then take the output of that function and use it for group-level inference. Looking forward to your comments.
Dear MNE community, during the last couple of months I've been working together with my mentors (@dengemann and @jona-sassenhagen) on the GSoC project for enhancing statistical inference using linear regression in MNE-Python. As the GSoC period comes to an end, we would like to present some of the major achievements and open the discussion on remaining issues, considerations, and possible strategies for future work.
This is a 2-3 minute read, sorry for the long post.
Quick recap:
The primary goal of the GSoC project was to broaden MNE's capabilities for fitting linear regression models, with a particular focus on statistical inference measures and on supporting more complex statistical models that might be of common interest to the MNE community.
Summary of major achievements:
We thought the best way to address these issues would be to set up a "gallery of examples" that allows users to browse through common research questions, providing auxiliary code for setting up and fitting linear models, as well as for inspecting and visualizing results with tools currently available in NumPy, SciPy, and MNE.
For this purpose we have put up a sandbox repository, which contains all the work carried out during the GSoC period. The code replicates and extends some of the main analyses and tools integrated in LIMO MEEG, a MATLAB toolbox originally designed to interface with EEGLAB. The corresponding website contains examples of typical single-subject and group-level analysis pipelines.
In the following I provide a quick overview of such an analysis pipeline and the corresponding features developed during GSoC.
During the project, we've adopted a multi-level (or hierarchical) modeling approach, allowing the combination of predictors at different levels of the experimental design (trials, subjects, etc.) and testing effects in a mass-univariate fashion, i.e., not only focusing on average data for a few sensors, but taking the full data space into account (all electrodes/sensors at all time points of an analysis time window; see here).
Importantly, the analysis pipelines allow users to deal with within-subject variance (i.e., first-level analysis) as well as between-subject variance (i.e., second-level analysis) by modeling the co-variation of subject-level parameter estimates and inter-subject variability in a possible moderator variable (see here).
This hierarchical approach consists of estimating linear model parameters for each subject in a data set (done independently at each time point and sensor). At the second level, the beta coefficients obtained from each subject are combined across subjects to test for statistical significance.
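Sketched concretely, the first-level step amounts to solving a least-squares problem at every sensor/time point. A minimal NumPy illustration on simulated data (shapes and names are assumptions, not the sandbox API):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated setup: 10 subjects, 50 trials each, 32 sensors, 100 time points.
n_subjects, n_trials, n_sensors, n_times = 10, 50, 32, 100

betas = np.empty((n_subjects, n_sensors, n_times))
for subj in range(n_subjects):
    # Trial-wise data and one continuous predictor (plus an intercept column).
    data = rng.normal(size=(n_trials, n_sensors, n_times))
    predictor = rng.normal(size=n_trials)
    design = np.column_stack([np.ones(n_trials), predictor])

    # First level: flatten the spatiotemporal dimensions and solve the
    # least-squares problem for all sensor/time points at once.
    y = data.reshape(n_trials, -1)
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    betas[subj] = coefs[1].reshape(n_sensors, n_times)  # slope estimates

# Second level: one beta map per subject, ready for group-level inference.
print(betas.shape)  # (10, 32, 100)
```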
The implemented methods correspond to tests performed using bootstrap under H1 to derive confidence intervals (i.e., providing a measure of the consistency of the observed effects at the group level), and the "studentized bootstrap" (or bootstrap-t) to approximate H0 and control for multiple testing (e.g., via spatiotemporal clustering techniques).
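To illustrate the H0 side of this, here is a minimal bootstrap-t sketch on simulated subject-level betas. A maximum-statistic correction stands in for the clustering step, and all names and shapes are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input: first-level slope estimates, one map per subject.
n_subjects, n_sensors, n_times = 10, 32, 100
betas = rng.normal(loc=0.2, size=(n_subjects, n_sensors, n_times))

def t_map(x):
    """One-sample t-statistic at every sensor/time point."""
    return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))

t_obs = t_map(betas)

# Bootstrap under H0: center the data so the null is true, resample
# subjects with replacement, and collect the maximum absolute t-value.
centered = betas - betas.mean(0)
n_boot = 500
max_t = np.empty(n_boot)
for b in range(n_boot):
    sample = centered[rng.integers(0, n_subjects, n_subjects)]
    max_t[b] = np.abs(t_map(sample)).max()

# Family-wise-error-corrected threshold at alpha = 0.05.
threshold = np.quantile(max_t, 0.95)
significant = np.abs(t_obs) > threshold
```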
Open questions:
One of the main issues concerns the integration of the fitting tools into MNE's API.
The second major issue concerns the inference part.
There is code in `cluster_level` to run spatiotemporal clustering, which in principle mimics the behavior of `mne.stats.cluster_level._permutation_cluster_test`, but uses bootstrap to threshold the results. One option would be to adapt `mne.stats.cluster_level._permutation_cluster_test` itself; another would be to extract the cluster stats from it without permutation and submit these to bootstrap in a second function.

There are a couple of other issues, but since this post is already too long, it might be best to discuss them later (or in the issue section of our GSoC repository); a PR for more in-depth code discussion will follow shortly.
I really enjoyed working on this project during the summer and would be glad to continue working on these tools after GSoC.
Thanks for reading and looking forward to your feedback.