0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

non parametric posthoc tests #97

Closed zof1985 closed 5 years ago

zof1985 commented 5 years ago

Hi Todd,

I'm trying to figure out how to reliably conduct non-parametric spm1d post-hoc tests.

I'm working with a 2-way (non-parametric) repeated measures ANOVA that yields significant differences along some regions of the statistical maps for the two main effects, but not for the interaction effect. Since each main effect has three levels, I want to understand which levels differ from the others, and where. To this purpose I used non-parametric paired t-tests.

As you warn on spm1d.org, post hoc tests are not strictly valid because they involve separate smoothness assessments for each post hoc test. As far as I understand, n permutations are used in non-parametric spm1d to build the distribution from which the critical threshold of each test is calculated. However, since the permutations are randomly selected for every test, if n is smaller than the total number of possible permutations they can theoretically lead to different distributions, and thus to errors in the estimation of the critical thresholds.

Accordingly, I'm wondering whether it would be correct to perform the post hoc tests using the same n permutations used for the 2-way non-parametric repeated measures ANOVA. In my mind, this would ensure the same distribution for both the main and post hoc tests. However, I'm not sure this would be a valid approach.

Many thanks, Luca.

0todd0000 commented 5 years ago

Hi Luca,

...since the permutations are randomly selected for every test, if n is smaller than the total number of possible permutations, they theoretically can lead to different distributions and thus to errors in the estimation of the critical thresholds.

This is true, the estimated distributions (and calculated thresholds) do indeed change when different permutations are selected. However, the distributions numerically converge as the number of permutations increases.

Since the sample size is smaller than the population size, the sample distribution will never precisely match the population distribution, even when all possible permutations are used. This may seem problematic, but it's not really a problem, because the goal is not to precisely replicate the population distribution. The goal is rather to achieve a numerically stable estimate of the population distribution. Thus it is generally desirable to choose a value of n that is smaller than the total number of possible permutations, and then to check that the estimated distributions numerically converge when re-estimated using different random permutations.
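To make this concrete, here is a minimal numpy sketch (not spm1d's actual implementation, and using hypothetical data) of how the estimated critical threshold for a paired test stabilises as the number of random permutations grows. For paired differences, the permutation scheme is random sign flipping; the threshold is the (1 − alpha) quantile of the permuted test statistic:

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.normal(0.3, 1.0, size=10)   # hypothetical paired differences

def tstat(x):
    # one-sample t statistic on paired differences
    return x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))

def perm_threshold(d, n_perm, alpha=0.05, seed=None):
    # estimate the two-tailed critical threshold from n_perm random
    # sign-flip permutations of the paired differences
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1, 1], size=(n_perm, len(d)))
    dist = np.abs([tstat(s * d) for s in signs])
    return np.quantile(dist, 1 - alpha)

# the spread of threshold estimates across different random permutation
# sets shrinks as the number of permutations grows
for n in (50, 500, 5000):
    thresholds = [perm_threshold(d, n, seed=s) for s in range(5)]
    print(n, round(float(np.std(thresholds)), 3))
```

The same convergence logic carries over to the 1D (field-level) case, where spm1d builds the permutation distribution of the field maximum rather than of a scalar statistic.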

I'm wondering if it would be correct to perform the post-hoc tests using the same n permutations used for the 2-way non-parametric repeated measures ANOVA.

As far as I know this is not possible because post hoc tests use only a subset of the ANOVA data. Consider one-way ANOVA involving three groups ("A1", "A2" and "A3"). Data from all three groups is needed to generate estimates of the underlying F distribution. By contrast, a post hoc t test on A1 vs A2, for example, does not use the A3 data, because the A3 data are irrelevant to the distribution underlying the A1-A2 comparison. Since the datasets are different, it's not possible to use the same permutations in t tests and ANOVA.
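The mismatch between the two permutation schemes can be sketched as follows (a numpy illustration with hypothetical data, not spm1d's internals): for a repeated-measures design, an ANOVA permutation shuffles the condition labels within each subject and therefore touches all three columns, whereas a paired t-test permutation on A1 vs A2 only flips the signs of the A1−A2 differences, so the A3 data never enter:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                   # hypothetical number of subjects
Y = rng.normal(size=(n, 3))             # columns: conditions A1, A2, A3

# RM-ANOVA permutation unit: shuffle the condition labels within each
# subject, so every permutation involves all three conditions
perm_anova = np.stack([row[rng.permutation(3)] for row in Y])

# paired t-test (A1 vs A2) permutation unit: flip the sign of each
# subject's A1-A2 difference; the A3 column plays no role at all
d = Y[:, 0] - Y[:, 1]
perm_ttest = rng.choice([-1, 1], size=n) * d

print(perm_anova.shape, perm_ttest.shape)
```

Since the permutation units operate on different datasets (all three columns vs. one column of differences), there is no meaningful way to "reuse" the ANOVA permutations in the post hoc tests.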

Nevertheless, even if there were a way to use the same permutations, this might not be a good choice. An implicit assumption of permutation tests is that the number of observations is large enough to yield numerical convergence of the estimated distribution, irrespective of the actual permutations chosen. From this perspective, it is generally desirable to run permutation tests multiple times, with different random permutations each time, and to check for numerical convergence. The same applies to post hoc tests.

Overall I'd recommend regarding the ANOVA results as the main results, then doing anything you want for post hoc analysis provided...

  1. ...both the ANOVA and the post hoc results are numerically stable
  2. ...the post hoc results do not disagree with the main ANOVA results

From the perspective of experimental design: only the ANOVA results are relevant to the experiment's null hypothesis. Post hoc analysis simply qualifies the ANOVA results.
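A simple stability check along the lines of point 1 can be sketched as follows (a self-contained numpy illustration with hypothetical data and an arbitrary tolerance, not an spm1d routine): re-estimate the post hoc critical threshold several times with different random permutations, and only trust the result if the estimates agree closely:

```python
import numpy as np

def perm_threshold(d, n_perm, alpha=0.05, seed=None):
    # two-tailed critical threshold from random sign-flip permutations
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1, 1], size=(n_perm, len(d)))
    x = signs * d
    t = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(len(d)))
    return np.quantile(np.abs(t), 1 - alpha)

rng = np.random.default_rng(2)
d = rng.normal(0.5, 1.0, size=12)       # hypothetical post hoc paired differences

# re-estimate the threshold with different random permutation sets and
# check that the estimates agree before interpreting the post hoc result
ths = np.array([perm_threshold(d, 10000, seed=s) for s in range(5)])
stable = np.ptp(ths) / ths.mean() < 0.02   # 2% relative spread is an arbitrary choice
print(ths.round(3), stable)
```

If the estimates do not stabilise, increasing the number of permutations (up to the total number available) is the natural remedy.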

zof1985 commented 5 years ago

Hi Todd,

Thank you for your detailed answer. Indeed, I was assuming n large enough to obtain stable ANOVA results (about 30,000 permutations out of more than 1,000,000 unique possible permutations), and I then used the same number of permutations for the paired t-tests. However, the results of the latter disagree substantially with the former. I will therefore repeat the post hoc tests several times to check whether the estimated critical thresholds are stable.

Many thanks, Luca.