Update calls to MVPA-Light in ft_statistics_mvpa

treder commented 3 years ago

Remove the following lines in case they don't apply

Please ask the community on the discussion list for help on using FieldTrip.
Please see here if you want to contact the lead developpers privately.

treder commented 3 years ago

Hi, I'm phasing out the functions mv_crossvalidate and mv_searchlight since their functionality is contained in my_classify (they were initially written before mv_classify existed). The functions will eventually disappear from the toolbox so it'd be great to replace the mv_crossvalidate / mv_searchlight calls by mv_classify calls.

Should be a straightforward fix, I can also do this myself and post a pull request if that's useful?

schoffelen commented 3 years ago

Hi Matthias, yes that would be great. Or at least you can initiate a PR and that we work on the draft together

treder commented 3 years ago

This sounds good, I'll get in a PR soon and we can take it from there.

treder commented 3 years ago

Reopening this since I had a look into _ft_statisticsmvpa and there's couple of questions before I start. MVPA-Light (short ML in the following) has grown quite a lot in the last year and parts of the interface changed considerably, so I think there needs to be a major overhaul of _ft_statisticsmvpa. Just to summarize:

1) _mvcrossvalidate and mv_searchlight are phased out as I mentioned earlier. The surviving classification functions are _mv_classify_acrosstime and _mv_classifytimextime, I keep these mostly for user convenience. The newer function _mvclassify was written after we first designed the FT interface. It abstracts and extends all the functionality in the other functions i.e. it can work with data of any dimension, do searchlight along any dimension, and generalization along any dimension. I think _mv_classifytimextime is tiny bit more efficient for time generalization, but in principle we could only use mv_classify throughout _ft_statisticsmvpa - this could make the interface simpler.

2) ML can now do MVPA for regression, too, using the function _mvregress. This should be easy to include: cfg.design can be either class labels (classification) or responses (regression). Depending on which model the users specifies we call _mvclassify (e.g. when the model is 'lda') or _mvregress (e.g. when the model is 'ridge').

3) There is now a nested preprocessing pipeline. This should be easy to include using a new field cfg.mvpa.preprocess, so I can just give it a go.

4) There is also a better statistical testing framework in ML now. Eg it supports level 1 (single subject) and level 2 (group level) statistical testing of MVPA metrics. Would be good to pick your brain on what's the best way to integrate it with FT. Actually Level 2 is simply group-level statistical testing of the metrics, for this I reproduce some FT functionality (such as cluster permutations); this can be entirely done within FT. What could be addressed in _ft_statisticsmvpa is Level 1 (getting p-values for the metrics for a given participant), because the MVPA toolbox is required e.g. for permutation testing (=rerunning the classification analysis many times with permuted class labels). The latter shouldn't be difficult to implement by just adding a few fields to cfg.mvpa.

Regarding the cfg interface for _ft_statisticsmvpa, I propose the following changes as a first go:

cfg.mvpa.replace/balance/normalise: these parameters need to be removed, they don't exist in ML any more, their functionality has been absorbed into the (more general) preprocessing framework
cfg.mvpa.size: remove, this parameter does not exist any more
cfg.mvpa.param -> rename to cfg.mvpa.hyperparameter (to be consistent with ML)
cfg.mvpa.classifier ->rename to cfg.mvpa.model (to make it work for both classification and regression)
cfg.mvpa.group: new parameter required for a new cross-validation scheme using 'predefined' folds
cfg.timextime: we can now perform generalization across any dimension (e.g. frequency x frequency, channel x channel etc), therefore I propose to remove this field and instead have a field cfg.mvpa.generalize which can be set to designate the generalization dimension (if any). This parameter can be numerical (eg "2" means 2nd dimension in dat.trial) or a string (e.g. 'time') when the dimensions are named.
cfg.searchlight: the idea of a searchlight has also been generalized in the toolbox. In ML the interface works slightly different: E.g. in mv_classify you specify the sample_dimension and the feature_dimension. Say you set sample_dimension=1 and feature_dimension=2. If your data is [samples x channels x time] a classification is performed for every time point, using channels as features (searchlight across time). If feature_dimension=3, the time points serve as features and a separate classification is performed for every channel (searchlight across channels). So one option would be to take the MVPA-Light approach and have a field cfg.mvpa.feature_dimension to define the features (all dimensions that are not features or samples are then designated as searchlight dimensions). Alternatively, if this is more user friendly, one can set cfg.mvpa.searchlight directly (all dimensions that are not samples or searchlight then serve as features). I have some preference for the former since even for higher-dimensional data (e.g. time-frequency) there's often only one feature dimension whereas there can be many searchlight dimensions.

Sorry this became quite a long post :D whenever you get to it, it'd be great to get your thoughts on (any of) these.

schoffelen commented 3 years ago

Hi Matthias, I am all in favor of not having to rely on different functions for different functionality. I am fine with the changes. What we should try to aim for, is to have all this take place 'under the hood', i.e. build in some intelligence that automatically renames the set of arguments into new-style definitions, or throw informative warnings, so that backward compatility is compromised as little as possible. Also, we'd need to go over the documentation as well (tutorial page). I have been making some changes to reflect the updated behavior of your code (already upgrading ft_statistics_mvpa to use mv_classify in some of the cases), but I think that this can only be improved. I'll need to check, whether the website changes only exist on my end, or whether this has already been pushed to the website repo.

treder commented 3 years ago

OK sounds all good, I will make sure to put it some downward compatibility checks. Let me know whether you want to push any changes before I get started.

schoffelen commented 3 years ago

No, please go ahead.

With respect to config checking and backwardcompatibility stuff, we typically rely on ft_checkconfig. For reference, you can check with line 106 and beyond in ft_statistics_montecarlo how it can be used.

treder commented 3 years ago

OK great, will have a look.

fieldtrip / fieldtrip

Update calls to MVPA-Light in ft_statistics_mvpa #1624