Centre-IRM-INT / GT-MVPA-nilearn

GT MVPA nilearn from Marseille

Your take on demeaning & normalizing data before MVPA analyses #22

Open JeanneCaronGuyon opened 2 years ago

JeanneCaronGuyon commented 2 years ago

Hi all,

I've seen, in some papers and in the lab I'm in now (Collignon's), authors "demeaning" the data prior to running MVPA analyses, basically to take out the univariate differences (more activation in condition A than in B) and only consider pattern differences. For instance, Rezk et al., 2020: "Before running the decoding analysis in each ROI, the patterns of each block and each condition (motion and static) was demeaned individually to minimize the univariate activation level differences and to mathematically equate the mean activity of each condition and each block."
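
For what it's worth, here is a minimal sketch of what that kind of demeaning could look like on a samples-by-voxels matrix (the helper `demean_patterns` and the toy data are mine, not code from Rezk et al.): each block's pattern has its own mean across voxels subtracted, so every block and condition ends up with the same (zero) mean activity.

```python
# Minimal sketch (not the authors' exact code): demean each block's pattern
# so that every sample has zero mean across voxels, removing the overall
# activation-level difference between conditions.
import numpy as np

def demean_patterns(X):
    """X: (n_samples, n_voxels) array of block-wise patterns.
    Subtract each pattern's own mean across voxels."""
    return X - X.mean(axis=1, keepdims=True)

# Toy example: 10 blocks x 500 voxels, with an artificial per-block offset
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 500)) + rng.normal(size=(10, 1))
X_demeaned = demean_patterns(X)
assert np.allclose(X_demeaned.mean(axis=1), 0.0)  # mean activity equated across blocks
```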

To me, that makes sense especially for cross-modal decoding, since in areas previously defined as "unisensory" (e.g. for modality A) we will probably get much less activation for modality B, and that could influence the decoding (see for instance Smith et al., 2011). However, some have warned about the potential negative effects of these methods on the interpretation of the data (I'm thinking of Ramirez's paper for instance, which talks more about RSA though).

Then, there's also the z-scoring of the patterns; again, for instance Rezk et al., 2020: "In each cross-validation fold, the training data (5 blocks per condition for the visual motion localizer, 12 blocks per condition for the auditory localizer) were normalized (z-scored) across conditions and the classifier was trained to discriminate the motion and static conditions."
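
A minimal sketch of that fold-wise normalization, assuming `X` (blocks x voxels), `y` (condition labels) and `groups` (run/block labels) arrays that are mine for illustration: the key point is that the scaling parameters are estimated on the training blocks of each fold only, and then applied unchanged to the held-out blocks.

```python
# Sketch of fold-wise z-scoring: fit the scaler on the training blocks of each
# cross-validation fold only, apply it to the test blocks, then score an SVM.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def fold_wise_decoding(X, y, groups):
    """groups: run/block labels used to define the cross-validation folds."""
    scores = []
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        scaler = StandardScaler().fit(X[train])              # training data only
        clf = SVC(kernel="linear").fit(scaler.transform(X[train]), y[train])
        scores.append(clf.score(scaler.transform(X[test]), y[test]))
    return np.mean(scores)
```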

We (Anne, Caro, Jean-Luc and I) have been looking at procedures to perform MVPA analyses in a more relevant way, especially as we use two different sensory modalities (either Auditory & Tactile or Visual & Tactile) and also do cross-modal decoding, for which such methods could actually be pertinent.
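
To make the cross-modal case concrete, here is a rough sketch of what such an analysis could look like (the variable names and the choice of a linear SVM are illustrative assumptions, not our actual pipeline): each pattern is demeaned first, then the classifier is trained on one modality and tested on the other.

```python
# Sketch of cross-modal decoding with per-pattern demeaning: train on one
# modality (e.g. auditory), test on the other (e.g. tactile).
from sklearn.svm import SVC

def demean_patterns(X):
    # Same helper as in the sketch above: zero-mean each pattern across voxels.
    return X - X.mean(axis=1, keepdims=True)

def cross_modal_decoding(X_train_mod, y_train, X_test_mod, y_test):
    """Train on the patterns of one modality, test on those of the other."""
    clf = SVC(kernel="linear")
    clf.fit(demean_patterns(X_train_mod), y_train)
    return clf.score(demean_patterns(X_test_mod), y_test)

# e.g. acc = cross_modal_decoding(X_auditory, y_auditory, X_tactile, y_tactile)
```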

Have you been using such methods, and in which contexts? Or, theoretically, what's your take on their use and their impact on the subsequent interpretation of decoding performances?

Thanks!! Jeanne

JeanneCaronGuyon commented 2 years ago

A little follow-up on that: I've even been told that we should always normalize our data across conditions during the training phase... (and the extra step of demeaning the patterns would then just be used for cross-modal decoding). I see how that can help, but I really don't know if it's the classic/"right" (fair?) way to go... I'm super interested in having your input, if you have any opinion on that!! Maybe @SylvainTakerkart? Thanks :)

SylvainTakerkart commented 2 years ago

Hi,

Here is my take on all this:

Well, this was a true "réponse de normand" (a Norman's answer, i.e. neither yes nor no)! My conclusion is: go for what you believe in from a neuroscience point of view; at least you'll be able to defend it in front of a reviewer! As long as you don't touch the test set during your training phase, you're good from a methodological point of view!
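
One way to make that last rule hard to violate by accident (a sketch, not code from this repository) is to wrap the normalization and the classifier in a scikit-learn `Pipeline`, so that the scaler is refit inside each training fold and never sees the held-out blocks.

```python
# Pipeline + cross_validate: the scaler is refit on the training split of each
# fold, so the test set is never touched during the training phase.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_validate, LeaveOneGroupOut

pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear"))
# With X, y, groups as in the earlier sketch:
# results = cross_validate(pipeline, X, y, groups=groups, cv=LeaveOneGroupOut())
```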

JeanneCaronGuyon commented 2 years ago

Thanks Sylvain, I definitely see what you mean... I do believe that going with what makes sense from a neuroscience point of view feels best, and that will most likely help us settle on analyses we trust. We'll see what we go for, but thank you for your input on these matters!