0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

Nonparametric results interpretation #155

Closed 0todd0000 closed 3 years ago

0todd0000 commented 3 years ago

(The question below is copied from an email conversation)

We calculate t values for statistically different groups of signals, and we would like to compare these values (the data are not normally distributed). Can we use additional SPM in non-normally distributed SPM data (nonparametric procedures in spm1d.stats.nonparam)? How can we interpret the t value in further analysis (not just for hypothesis testing)? We believe that the beginning of the statistical difference between normal and potentiated muscle contraction can be interesting information. The maximal value of t over the threshold and the significance of the t area over the threshold can be useful information about the intensity and duration of potentiation and fatigue. The integral of t over the threshold can be defined as the potentiation impulse.

0todd0000 commented 3 years ago

> Can we use additional SPM in non-normally distributed SPM data (nonparametric procedures in spm1d.stats.nonparam)?

Yes, nonparametric procedures are suitable for any distribution.
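To make the "any distribution" point concrete, here is a minimal sketch of a label-permutation test on a 1D continuum. It is not spm1d's implementation: the helper names (`t_curve`, `permutation_threshold`) and the toy data are hypothetical, and the pooled-variance t statistic is an assumption. The critical threshold comes from the permutation distribution of the maximum |t| across the continuum, so no normality assumption is needed.

```python
import math
import random

def _mean_var(values):
    """Sample mean and unbiased sample variance of a sequence."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)

def t_curve(a, b):
    """Pooled-variance two-sample t statistic at each node (assumed form)."""
    na, nb = len(a), len(b)
    out = []
    for col_a, col_b in zip(zip(*a), zip(*b)):
        ma, va = _mean_var(col_a)
        mb, vb = _mean_var(col_b)
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        out.append((ma - mb) / math.sqrt(pooled * (1.0 / na + 1.0 / nb)))
    return out

def permutation_threshold(a, b, alpha=0.05, iterations=500, seed=0):
    """Two-tailed critical |t| from the permutation distribution of max |t|."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    max_abs_t = []
    for _ in range(iterations):
        rng.shuffle(pooled)  # randomly reassign group labels
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        max_abs_t.append(max(abs(t) for t in t_curve(perm_a, perm_b)))
    max_abs_t.sort()
    # position of the (1 - alpha) quantile in the sorted max-|t| values
    k = min(len(max_abs_t) - 1, math.ceil((1 - alpha) * len(max_abs_t)) - 1)
    return max_abs_t[k]

# Hypothetical toy data: 8 curves per group, 20 nodes, effect at nodes 15-19
data_rng = random.Random(1)
nodes = 20
group_a = [[data_rng.gauss(0, 1) for _ in range(nodes)] for _ in range(8)]
group_b = [[data_rng.gauss(0, 1) + (3.0 if q >= 15 else 0.0) for q in range(nodes)]
           for _ in range(8)]

t_obs = t_curve(group_a, group_b)
t_crit = permutation_threshold(group_a, group_b)
suprathreshold_nodes = [q for q, t in enumerate(t_obs) if abs(t) > t_crit]
print(t_crit, suprathreshold_nodes)
```

Only the relabelling step depends on the data, which is why the procedure is valid whatever the underlying distribution.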

> How can we interpret the t value in further analysis (not just for hypothesis testing)?

In spm1d, nonparametric t tests use the exact same t value as parametric tests, so the interpretation of the t value is identical to parametric procedures. The t value effectively indicates effect size, but unlike effect size, its probabilistic properties are precisely known. Just as effect sizes can be interpreted in contexts other than hypothesis testing, so too can t values.
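For reference, the sketch below computes a two-sample t statistic at each node of a continuum, which shows why it behaves like a standardized effect size (a mean difference divided by its standard error). The pooled-variance form is an assumption made for illustration, and the function name `ttest2_pointwise` and the toy data are hypothetical; this is not spm1d code.

```python
import math
from statistics import mean, variance

def ttest2_pointwise(group_a, group_b):
    """Pooled-variance two-sample t statistic at each node of a 1D continuum.

    group_a, group_b: lists of equal-length curves (one list of floats per observation).
    """
    n_a, n_b = len(group_a), len(group_b)
    t_values = []
    # zip(*group) iterates over nodes, yielding all observations at that node
    for col_a, col_b in zip(zip(*group_a), zip(*group_b)):
        pooled_var = ((n_a - 1) * variance(col_a) + (n_b - 1) * variance(col_b)) / (n_a + n_b - 2)
        standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
        t_values.append((mean(col_a) - mean(col_b)) / standard_error)
    return t_values

# Hypothetical toy data: 3 curves per group, 4 nodes each
group_a = [[1.0, 2.0, 3.0, 4.0], [1.2, 2.1, 3.3, 4.2], [0.8, 1.9, 2.7, 3.8]]
group_b = [[1.0, 2.0, 2.0, 2.0], [1.1, 2.2, 2.2, 2.1], [0.9, 1.8, 1.8, 1.9]]

t_values = ttest2_pointwise(group_a, group_b)
print([round(t, 2) for t in t_values])
```

In this toy example the group means coincide at the first node (t near zero) and diverge toward the last node (large t), mirroring how the t curve tracks the size of the difference along the continuum.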

> We believe that the beginning of the statistical difference between normal and potentiated muscle contraction can be interesting information.

SPM procedures can indeed be used to detect the beginning of signals, but this is not their primary purpose. Their primary purpose is to conduct domain-wide tests, where "domain" usually means 0% - 100% time. If the main purpose of your analysis is to detect signal onset, then an event detection procedure like this one might be more appropriate. SPM results will likely be similar to event-detection results, but not necessarily.
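If you do want an onset estimate from an SPM result, one simple option is to locate the first point where the t curve crosses the critical threshold, interpolating linearly between nodes. The sketch below assumes a positive (upper) threshold and uses hypothetical values for the curve and threshold; `first_crossing` is an illustrative helper, not an spm1d function.

```python
def first_crossing(t_values, threshold):
    """Interpolated node position where t_values first exceeds threshold, or None."""
    if t_values[0] > threshold:
        return 0.0  # already above threshold at the start of the domain
    for i in range(1, len(t_values)):
        t0, t1 = t_values[i - 1], t_values[i]
        if t0 <= threshold < t1:
            # linear interpolation between node i-1 and node i
            return (i - 1) + (threshold - t0) / (t1 - t0)
    return None  # the curve never exceeds the threshold

# Hypothetical t curve and critical threshold
t_values = [0.5, 1.0, 2.0, 4.0, 5.0, 3.5, 2.0]
onset_index = first_crossing(t_values, threshold=3.0)

# Convert the node position to percent of the domain (0% to 100%)
onset_percent = 100.0 * onset_index / (len(t_values) - 1)
print(onset_index, onset_percent)
```

Keep in mind that an onset defined this way depends on the critical threshold, and therefore on alpha and sample size, which is one reason a dedicated event-detection procedure may give different results.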

> The maximal value of t over the threshold and the significance of the t area over the threshold can be useful information about the intensity and duration of potentiation and fatigue. The integral of t over the threshold can be defined as the potentiation impulse.

These may indeed be interesting metrics to consider.
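As a rough illustration of the three quantities mentioned in the question, the sketch below computes the peak excess over the threshold, the number of suprathreshold nodes, and a trapezoidal integral of the excess. The function name and data are hypothetical, unit node spacing is assumed, and the threshold crossings are not interpolated, so the integral is approximate and will not necessarily match the value spm1d reports for a cluster.

```python
def suprathreshold_metrics(t_values, threshold):
    """Peak excess, duration (in nodes), and trapezoidal integral of t above threshold."""
    # excess is zero wherever the curve is at or below the threshold
    excess = [max(t - threshold, 0.0) for t in t_values]
    peak = max(excess)
    duration = sum(1 for e in excess if e > 0)
    # trapezoidal rule with unit spacing between nodes
    integral = sum(0.5 * (e0 + e1) for e0, e1 in zip(excess, excess[1:]))
    return peak, duration, integral

# Hypothetical t curve and critical threshold
t_values = [0.5, 1.0, 2.0, 4.0, 5.0, 3.5, 2.0]
peak, duration, integral = suprathreshold_metrics(t_values, threshold=3.0)
print(peak, duration, integral)
```

For a negative cluster (t below a negative threshold), the same function can be applied to the negated curve and the negated threshold.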

# This code is copied from ./spm1d/examples/nonparam/1d/ex_ttest2.py

import numpy as np
from matplotlib import pyplot
import spm1d

#(0) Load dataset:
dataset    = spm1d.data.uv1d.t2.PlantarArchAngle()
yB,yA      = dataset.get_data()  #normal and fast walking

#(1) Conduct non-parametric test:
np.random.seed(0)
alpha      = 0.05
two_tailed = True
snpm       = spm1d.stats.nonparam.ttest2(yA, yB)
snpmi      = snpm.inference(alpha, two_tailed=two_tailed, iterations=500)

cluster    = snpmi.clusters[0]

print( cluster )
[Cluster (NonParam)
    threshold       :  -3.023
    centroid        :  (96.607, -4.414)
    isinterpolated  :  True
    iswrapped       :  False
    endpoints       :  (92.218, 100.000)
    extent          :  7.782
    metric          :  ClusterIntegral
    metric_value    :  11.11875
    nPermUnique     :  184756 unique permutations possible
    nPermActual     :  500 actual permutations
    P               :  0.00000
 ]

You'll see that the metric is ClusterIntegral and that its value is 11.11875. You can access the value directly using the metric_value property:

x = cluster.metric_value
print( x )
11.118746826689936