0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

inconsistencies #281

Closed liokainobrega closed 2 months ago

liokainobrega commented 4 months ago

We have encountered some inconsistencies in our gait analyses. When we run the same data multiple times, the results vary from run to run.

For example, when analyzing the hip curve in the frontal plane, we observed fluctuations in the p values and the critical t value (using the ttest2 test). Does anybody know why this can happen?

[Attached images: "Hip", "t test HIP"]

0todd0000 commented 4 months ago

If you are using parametric inference (e.g. spm1d.stats.ttest2) then no fluctuations are expected.

If instead you are using nonparametric inference (e.g. spm1d.stats.nonparam.ttest2) then fluctuations are indeed expected; this is the nature of the nonparametric permutation method that is implemented in spm1d.

Consider the following code:

import numpy as np
import spm1d

# yA, yB: the data arrays for the two groups (not defined here; load your own data)
alpha = 0.05

np.random.seed(0)
tA = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, iterations=500)
tB = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, iterations=500)  # different from tA
np.random.seed(0)
tC = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, iterations=500)  # identical to tA

Here the tA and tC results are equivalent because np.random.seed controls the random number generator state, and thereby controls the observation label permutations that spm1d randomly selects.
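To illustrate why the seed matters, here is a minimal toy sketch of a random-permutation test in plain Python. It is not spm1d's implementation: the function name perm_pvalue, the scalar sample data, and the use of random.Random in place of np.random are all assumptions made for illustration only.

```python
import random

def perm_pvalue(a, b, iterations, rng):
    """Toy two-sided permutation p-value for a difference in means (scalar data)."""
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_b = len(pooled) - n_a
    count = 0
    for _ in range(iterations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            count += 1
    return count / iterations

# Hypothetical sample data, invented for this sketch
a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 5.0, 5.7]
b = [4.9, 4.4, 5.2, 5.0, 5.6, 4.7, 5.3, 4.6]

rng = random.Random(0)
p_a = perm_pvalue(a, b, 500, rng)
p_b = perm_pvalue(a, b, 500, rng)  # generator state has advanced: may differ from p_a
rng = random.Random(0)
p_c = perm_pvalue(a, b, 500, rng)  # same starting state as p_a: identical result
```

Because p_a and p_c start from the same generator state, they visit exactly the same sequence of label permutations and so agree exactly. p_b draws a different sequence, so it will generally (though not necessarily) differ slightly.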

Note also that, while the tB results are numerically different, they are expected to converge to the tA results for a large number of iterations. Usually iterations=500 or iterations=1000 is suitable for rough convergence, but iterations=1e4, iterations=1e5 or even iterations=1e6 may be preferable for final results.
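As a rough guide to how quickly results settle, the sketch below uses the standard binomial approximation for the run-to-run standard error of a p value estimated from randomly sampled permutations. This approximation is not stated in the thread and is not an spm1d function; the helper name mc_standard_error is hypothetical, and the formula assumes the permutations are drawn independently at random.

```python
import math

def mc_standard_error(p, iterations):
    """Approximate run-to-run standard error of a permutation p value.

    Binomial approximation: sqrt(p * (1 - p) / iterations).
    """
    return math.sqrt(p * (1.0 - p) / iterations)

for n in (500, 1000, 10_000, 100_000, 1_000_000):
    se = mc_standard_error(0.05, n)
    print(f"iterations={n:>9,d}  approx. SE near p=0.05: {se:.5f}")
```

Under this approximation the scatter shrinks with the square root of the number of iterations, so going from 500 to 1e4 iterations reduces it by a factor of about 4.5.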