0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0
61 stars 21 forks source link

Insignificant post-hoc paired t-test, in presence of sig RM-ANOVA interaction #71

Closed bernard-liew closed 6 years ago

bernard-liew commented 6 years ago

Dear Todd,

How have you been? I think this question is a general question for any ANOVA based statistics, rather than SPM per se.

I have a significant interaction effect, between two predictor variables, each with two levels, and used a RM-ANOVA. However, there wasn't any interaction between any of the 4 paired t-tests I did as a post-hoc, at least at the phase of gait where significance was found in the primary test.

I was wondering if you could channel me to some answers.

regards, Bernard

0todd0000 commented 6 years ago

Hi Bernard!

That result sounds a bit strange, I too would expect to see something in post hoc analysis if there were a significant interaction effect. So I wonder if it could be a scripting error? Can you please copy-and-paste your script into this discussion?

Todd

bernard-liew commented 6 years ago

Dear Todd,

Sorry my script is really untidy by any coder's standards. This pertains specifically to knee transverse ("z" axis) angle ie. "kz" in script. If you need my data, I would be happy to send you.

Regards, Bernard

Specifiy file path and add to python path

filepath = os.path.join('C:\Users\liew_\Desktop\SPManalysis\') sys.path.append(filepath)

VAR = 'CutJtAngleC'

Read in datasets

d = pd.read_table (os.path.join (filepath,''.join([VAR,'.txt'])))

scale = pd.read_table (os.path.join (filepath,'ScaleFact.txt'))

Subset load carriage side

dcut= d[(d.SIDE == 'L')]

def coding(col, codeDict): colCoded = pd.Series(col, copy=True) for key, value in codeDict.items(): colCoded.replace(key, value, inplace=True) return colCoded

dcut["LOAD"] = coding (dcut["LOAD"] , {'BW':0,'BK':1}) dcut["CUE"]= coding (dcut["CUE"], {'AN':0,'UN':1})

Create dataframe for each joint and each axes

ankXang = dcut [(dcut.JT == "AK") & (dcut.AXES == "X")] ankYang = dcut [(dcut.JT == "AK") & (dcut.AXES == "Y")] ankZang = dcut [(dcut.JT== "AK") & (dcut.AXES == "Z")] kneXang = dcut [(dcut.JT == "KN") & (dcut.AXES == "X")] kneYang = dcut [(dcut.JT == "KN") & (dcut.AXES == "Y")] kneZang = dcut [(dcut.JT == "KN") & (dcut.AXES == "Z")] hipXang = dcut [(dcut.JT == "HP") & (dcut.AXES == "X")] hipYang = dcut [(dcut.JT == "HP") & (dcut.AXES == "Y")] hipZang = dcut [(dcut.JT == "HP") & (dcut.AXES == "Z")]

Plots to check for waveform outliers

dcheck = np.transpose(np.asarray(hipZang.iloc[:,8:109])) pyplot.plot(dcheck, color='k')

Filter out waveforms with obvious outliers.

kneXang = kneXang[((kneXang.ix[:,7:108] > -80)).all(axis=1)] kneYang = kneYang[((kneYang.ix[:,70:85] <10)).all(axis=1)] hipXang = hipXang[((hipXang.ix[:,80:108] < 60)).all(axis=1)]

ankXang = ankXang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() ankYang = ankYang.groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() ankZang = ankZang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() kneXang = kneXang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() kneYang = kneYang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() kneZang = kneZang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() hipXang = hipXang.groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() hipYang = hipYang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean() hipZang = hipZang .groupby(['SUBJ','LOAD','CUE', 'VAR', 'AXES'], as_index = False).mean()

Sort so that a common frame of predictors can be created

ankXang = ankXang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() ankYang = ankYang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() ankZang = ankZang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() kneXang = kneXang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() kneYang = kneYang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() kneZang = kneZang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() hipXang = hipXang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() hipYang = hipYang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index() hipZang = hipZang.sort_values(['SUBJ', 'LOAD', 'CUE' ], ascending=[True, True, True]).sort_index()

Create input data

SUBJ = kneYang['SUBJ']

B = (kneYang['CUE']).values

A = (kneYang ['LOAD']).values

ax = (ankXang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values ay = (ankYang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values az = (ankZang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values kx = (kneXang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values ky = (kneYang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values kz = (kneZang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values hx = (hipXang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values hy = (hipYang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values hz = (hipZang.drop(['SUBJ','LOAD','CUE','VAR','AXES','TRIAL'], axis=1, inplace=False)).values

ang = [ax,ay,az,kx,ky,kz,hx,hy,hz]

Perform Two-way ANOVA with repeated-measures on two factors

t, t0, ti, ti0 = [],[],[],[]

for n in range(len(ang)): t0 = spm1d.stats.anova2rm( ang[n], A, B, SUBJ) t.append(t0)

for x in range (len(t)): ti0 = t[x].inference(alpha=0.05) ti.append(ti0)

for x in range (len(t)): for i in range (3): print ti[x][i].clusters

%%

Post hoc paired t-test for ankle and knee transverse angle

x = kz # kz for knee transverse angle x = [x [(A == 0) & (B == 0)], #BW_AN x [(A == 0) & (B == 1)], #BW_UN x [(A == 1) & (B == 0)], #BK_AN x [(A == 1) & (B == 1)]] #BK_UN

Perform Two-way ANOVA with repeated-measures on one factor

p, p0, pi, pi0 = [],[],[],[]

p0 = spm1d.stats.ttest_paired(x[0], x[1]) # BW_AN vs BW_UN p.append(p0) p0 = spm1d.stats.ttest_paired(x[0], x[2])# BW_AN vs BK_AN p.append(p0) p0 = spm1d.stats.ttest_paired(x[1], x[2])# BW_UN vs BK_AN p.append(p0) p0 = spm1d.stats.ttest_paired(x[1], x[3])# BW_UN vs BK_UN p.append(p0)

for j in range (len(p)): pi0 = p[j].inference(alpha=0.05) pi.append(pi0) print pi[j].clusters

0todd0000 commented 6 years ago

Hi Bernard,

The code looks fine but I see two potential issues with respect to probing the interaction effect.

  1. The first point pertains to the two commands below. The first command assembles a number of variables, including kz, but then the second command appears to use only kz in post hoc tests. Are you seeing the interaction in the kz variable?
ang = [ax,ay,az,kx,ky,kz,hx,hy,hz]
x = kz # kz for knee transverse angle
  1. The second point pertains to the post hoc tests in the code snippet below (I've left out the append commands for brevity).
p01 = spm1d.stats.ttest_paired(x[0], x[1]) # BW_AN vs BW_UN
p02 = spm1d.stats.ttest_paired(x[0], x[2]) # BW_AN vs BK_AN
p12 = spm1d.stats.ttest_paired(x[1], x[2]) # BW_UN vs BK_AN
p13 = spm1d.stats.ttest_paired(x[1], x[3]) # BW_UN vs BK_UN

Note that the interaction effect pertains to the following means (leaving out variance for simplicity):

m0 = x[0].mean(axis=0)  #mean 1D continuum for BW_AN
m1 = x[1].mean(axis=0)  #mean 1D continuum for BW_UN
m2 = x[2].mean(axis=0)  #mean 1D continuum for BK_AN
m3 = x[3].mean(axis=0)  #mean 1D continuum for BK_UN

difference01 = m1 - m0  #mean 1D difference continuum (BW_AN vs. BW_UN)
difference23 = m3 - m2  #mean 1D difference continuum (BK_AN vs. BK_UN)

interaction = difference23 - difference01  #interaction effect (ignoring variance)

Thus you won't see interactions when you look at only single pairs of m0, m1, m2 and m3. In order to see the interaction effect in post hoc analysis you'd need to first construct difference continua, then submit those difference continua to a t test like this:

d01 = x[1] - x[0]  #BW_AN vs. BW_UN differences for each subject 
d23 = x[3] - x[2]  #BK_AN vs. BK_UN differences for each subject 

t = spm1d.stats.ttest_paired(d23, d01)

In other words, the interaction compares (BW_UN - BW_AN) to (BK_UN - BK_AN).

Todd

bernard-liew commented 6 years ago

Dear Todd,

Qn 1: Yes I did the post hoc on Kz, as it was one of two out of 9 response variables that had a significant interaction.

Qn2: I do understand your solution.

Your solution of (BW_UN - BW_AN) to (BK_UN - BK_AN), means finding the effects of UN vs AN at the two levels of BW/BK. That is an interaction.

Am I correct to add further that this must be supplemented by an additional interaction of (BK_UN-BW_UN) to (Bk_AN- BW_AN), the effects of BK vs BW at two levels of UN?AN.

My miss understanding: I got the impression for the paired t test for all pairs in the http://www.spm1d.org/doc/PostHoc/multivariate.html section. More that it was a misinterpretation on my part.

However, I would have thought that the paired comparison of BW_UN vs BK_AN and BW_AN vs BK_UN would be equivalent to an interaction.

Regards, Bernard

0todd0000 commented 6 years ago

Hi Bernard,

There is no need to test the cited interaction because it is actually equivalent to the first one. Note that:

x = (BW_UN - BW_AN) - (BK_UN - BK_AN)

is equivalent to:

x = (BW_UN - BK_UN) - (BW_AN - BK_AN)

Todd

bernard-liew commented 6 years ago

Hi Todd,

Sorry if I appear confused. Let me say my two factors are primary factor: load (BW and BK) and secondary factor: cue (AN and UN)

In your post you performed

interaction = difference23 - difference01

When I reflected on it, it seems to be the same as the interaction found in the RM anova? However, when I find an interaction between the primary and secondary factors, isn't the post hoc attempting to find out what is the difference in change in load, at different levels of cue? Because the interaction (difference23 - difference01) doesn't really tell me where the difference lie, only that the interaction is/isn't significant, a finding I already had from the anova interaction. So shouldn't the post hoc be

test 1 = paired t test (BW_AN - BK_AN) [ effect of load at AN cue] test 2 = paired t test (BW_UN- BK_UN)[ effect of load at UN cue]

What I am expecting to find is a significance in either test1 or test 2 [ assuming similar effect directions]?

Regards, Bernard

0todd0000 commented 6 years ago

Hi Bernard,

To probe interaction effects you need to consider all four groups as discussed above; it is not possible to see interactions by looking only at two groups at a time.

Tests 1 and 2 above cannot represent the interaction because they only consider two groups at once.

Todd

bernard-liew commented 6 years ago

Thanks Todd, Ahhh... yes I get it. But that isn't that already the interaction obtained from RM anova (interaction = difference23 - difference01)?

Regards, Bernard

0todd0000 commented 6 years ago

Yes, that's correct, but only for 2x2 ANOVA. For MxN ANOVA the interaction can be between arbitrary levels of the two factors.

Todd

bernard-liew commented 6 years ago

many thanks Todd again. That is clear now.