0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0
61 stars 21 forks source link

Confidence interval plot #119

Closed emiltoft closed 4 years ago

emiltoft commented 4 years ago

Dear Todd,

When comparing two data sets a use "spm1d.stats.ttest2" or "spm1d.stats.nonparam.ttest2" following the interference call. To illustrate the data I usually use the "spm1d.plot.plot_mean_sd". I found that it also was possible to plot with a confidence interval instead using "ci = spm1d.stats.ci_twosample" and then ci.plot(ax=ax). However, for nonparametric data the ci call will execute iterations an additional iteration of the dataset to estimate the ci's. I guess these iterations would not be identical? Is there a way to plot the confidence interval from the initial interference call based on the ttest2 function alone, or without overhead of the iteration-procedure?

Kind regards, Emil

0todd0000 commented 4 years ago

Dear Emil,

It is possible to achieve identical results with spm1d.stats.nonparam.ttest2 and spm1d.stats.ci_twosample if both:

Here is an example:

#(0) Load dataset:
dataset = spm1d.data.uv0d.ci2.AnimalsInResearch()
yA,yB   = dataset.get_data()

#(1) Compute confidence intervals:
np.random.seed(0)
alpha      = 0.05
iterations = 200
ci         = spm1d.stats.ci_twosample(yA, yB, alpha, datum='meanA', mu='meanB')
np.random.seed(0)
cinp       = spm1d.stats.nonparam.ci_twosample(yA, yB, alpha, datum='meanA', mu='meanB', iterations=iterations)

#(2) Conduct nonparametric t test:
np.random.seed(0)
t          = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, iterations=iterations)

#(3) Print results:
print( 'Thresholds:' )
print( f'   Parametric:                   {ci.zstar}' )
print( f'   Nonparametric (ci_twosample): {cinp.zstar}' )
print( f'   Parametric (ttest2):          {t.zstar[1]}' )

The results are:

Thresholds:
   Parametric:                   2.036933343460101
   Nonparametric (ci_twosample): 2.1818183197900787
   Parametric (ttest2):          2.1818183197900787

With spm1d's current interface It is not possible to avoid calling both nonparam.ci_twosample and nonparam.ttest2, but you could avoid this double-call by customizing the function ci_twosample function (below, copied from ./spm1d/stats/nonparam/ci.py):

def ci_twosample(yA, yB, alpha=0.05, datum='difference', mu=None, iterations=-1):
    spmi         = ttest2(yA, yB).inference(alpha, two_tailed=True, iterations=iterations)
    spmi         = _snpmi2spmi(spmi)
    JA,JB        = yA.shape[0], yB.shape[0]
    spmparam     = ttest2_parametric(yA, yB, equal_var=True)
    mA,mB        = spmparam.beta            #sample means
    s            = spmparam.sigma2**0.5     #sample standard deviation
    hstar        = spmi.zstar * s * (1./JA + 1./JB)**0.5
    if datum == 'difference':
        CIClass  = CITwoSampleDifference1DNP if spmi.dim==1 else CITwoSampleDifference0DNP
        ci       = CIClass(spmi, mA-mB, hstar, mu)
    else:
        CIClass  = CITwoSample1DNP if spmi.dim==1 else CITwoSample0DNP
        ci       = CIClass(spmi, mA, mB, hstar, mu)
    return ci

For example, instead of return ci, you could use return ci,spmi to return both the confidence interval and the SPM object, thereby avoiding the double-call.

Todd