0todd0000 / spm1dmatlab

One-Dimensional Statistical Parametric Mapping in Matlab.
GNU General Public License v3.0

ANOVA 3tworm - Within-subject study design - interpretation of the results, pilot testing, reporting/disclosing pilot results in the article #190

Closed leo10acm closed 9 months ago

leo10acm commented 9 months ago

Hi Todd,

First of all, thank you so much for setting up this support page for the SPM analysis.

I designed a within-subject experiment where I had each subject perform 2 types of jumps with 3 types of foot progression angles (6 conditions total if you will) and collected data from 20 subjects (10 males and 10 females).

This seemed like the appropriate test for what I intended to examine, so I used: spmlist = spm1d.stats.anova3tworm(Y, A, B, C, SUBJ);

The 2 repeated measures factors are the variables I manipulated (B and C).
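For reference, the factor vectors for this design can be sketched as follows. This is an illustrative numpy analogue of the MATLAB inputs (the MATLAB equivalents would use repelem/repmat); the row ordering of Y and the subject ordering (first 10 males, then 10 females) are assumptions:

```python
import numpy as np

n_subj, n_jump, n_fpa = 20, 2, 3             # 20 subjects, 2 jump types, 3 FPAs
n_obs = n_subj * n_jump * n_fpa              # one row of Y per observation (120)

SUBJ = np.repeat(np.arange(n_subj), n_jump * n_fpa)       # subject labels
A = np.where(SUBJ < 10, 0, 1)                # gender (between-subjects factor)
B = np.tile(np.repeat(np.arange(n_jump), n_fpa), n_subj)  # jump type (RM factor)
C = np.tile(np.arange(n_fpa), n_subj * n_jump)            # FPA (RM factor)

# With Y shaped (n_obs, n_nodes), the call would then be:
# spmlist = spm1d.stats.anova3tworm(Y, A, B, C, SUBJ)
```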

This is the result that I got and it makes sense that the main effect was coming from C, the foot progression angles.

Fig1

Given that A and B (gender and jump type) seem to have very little impact on Y compared to C, I demoted my test to a one-way repeated-measures ANOVA with a single RM factor, combining jump type and FPA into one factor with 6 conditions:

  1. dropjumptype1_FPA1
  2. dropjumptype1_FPA2
  3. dropjumptype1_FPA3
  4. dropjumptype2_FPA1
  5. dropjumptype2_FPA2
  6. dropjumptype2_FPA3.

So what used to be B x C is now a single factor A with levels 1:6 instead of 1:3, using the condition codes above: spm = spm1d.stats.anova1rm(Y, A, SUBJ);

Looking at this {F} curve, I can see that it is somewhat similar to the one above; however, the magnitude of F is reduced by about a factor of 3. My interpretation is that creating 6 conditions is almost completely unnecessary, because condition pairs 1 & 4, 2 & 5, and 3 & 6 behave similarly. Ultimately the FPA factor seems to dictate the behavior of Y, and combining the two factors only masked the larger effect of FPA with the smaller effect of drop jump type.

Fig2

So I went back to 3 conditions and set A = (1,2,3,1,2,3) for each subject (similar to the coding in anova3tworm): spm = spm1d.stats.anova1rm(Y, A, SUBJ); This gave the following F curve, which is the same as the MAIN C result of Fig 1.

Fig3
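The two recodings above can be expressed compactly. A hedged numpy sketch (level codes are 1-based to match the condition list above; the orderings of B and C per subject are assumptions):

```python
import numpy as np

# Per-observation repeated-measures labels for one dependent-variable matrix Y:
# B = jump type in {1, 2},  C = FPA in {1, 2, 3}
B = np.tile(np.repeat([1, 2], 3), 20)   # 20 subjects x 6 conditions each
C = np.tile([1, 2, 3], 40)

A6 = 3 * (B - 1) + C    # combined 6-level factor: jumptype1_FPA1 .. jumptype2_FPA3
A3 = C                  # collapsed back to FPA only, as in the third analysis

# spm = spm1d.stats.anova1rm(Y, A6, SUBJ)  # 6-level one-way RM ANOVA
# spm = spm1d.stats.anova1rm(Y, A3, SUBJ)  # 3-level version (matches MAIN C)
```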

I have the following questions now:

  1. Am I on the right path with my interpretation of the results?

  2. Having conducted all of these pilot tests, in the publication I plan to highlight the substantial impact of FPA on the dependent variable while acknowledging the non-significant effects of gender (given the importance of gender-disparity results in my field of study) and jump type, based on the statistical analyses conducted. Would you say that the setup of the first test (Fig 1) is the best way to maintain the strength of my study design, as opposed to clumping factors together as I did in the second test (Fig 2)?

  3. After conducting post hoc tests (paired t-tests seem the most appropriate for my study design), is there a good way to visually present an odd number of tests? I have 15, so I made separate figures with 5x2, 4x2, 3x2, and 3x2 panels, where the top row shows the t curves for each pair and the bottom row shows the corresponding Mean ± SD graphs (see below). Or should I just show the significant comparisons?

Fig4
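As a side note on the post hoc tests: the 15 comparisons are exactly the number of condition pairs from 6 conditions, and a multiple-comparisons correction is usually applied to the post hoc threshold. A quick sanity check using a Bonferroni-style correction (an illustrative assumption here, not something mandated by spm1d):

```python
from math import comb

n_conditions = 6
n_tests = comb(n_conditions, 2)      # all pairwise comparisons: 15
alpha = 0.05
alpha_corrected = alpha / n_tests    # Bonferroni-corrected threshold

print(n_tests, round(alpha_corrected, 5))   # → 15 0.00333
```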

Thanks again and I look forward to hearing back from you.

0todd0000 commented 9 months ago

Thank you for the feedback!

Am I on the right path with my interpretation of the results?

Yes, except I don't think it is suitable to combine factors in Step 2. The factor "FPA" has just three levels. It is likely inappropriate to create six levels because the levels are not independent.

Having conducted all of these pilot tests, in the publication I plan to highlight the substantial impact of FPA on the dependent variable while acknowledging the non-significant effects of gender (given the importance of gender-disparity results in my field of study) and jump type, based on the statistical analyses conducted. Would you say that the setup of the first test (Fig 1) is the best way to maintain the strength of my study design, as opposed to clumping factors together as I did in the second test (Fig 2)?

Yes. Additionally, please note:

After conducting post hoc tests (paired t-tests seem the most appropriate for my study design), is there a good way to visually present an odd number of tests? I have 15, so I made separate figures with 5x2, 4x2, 3x2, and 3x2 panels, where the top row shows the t curves for each pair and the bottom row shows the corresponding Mean ± SD graphs (see below). Or should I just show the significant comparisons?

This is difficult to answer because (a) it depends on what you wish to emphasize and (b) spm1d does not directly support this type of figure creation. If the point is to provide transparency and show all results then probably any format is fine.

leo10acm commented 9 months ago

Thank you so much for your feedback, Todd. I want to ask a couple more questions before I close this thread.

  1. Would you say that adding a couple more trials for each condition for my entire cohort of subjects would help protect against Type II errors, thus perhaps allowing more robust and definitive conclusions?
  • If you wish to conclude a null effect then generally power analysis is necessary to protect against Type II error. Not all reviewers may ask for power analysis, and the first figure's results are rather compelling, so it may be sufficient to simply make the claim cautiously with phrasing like "These results suggest that FACTOR A and FACTOR B had comparatively small impacts on DEPENDENT VARIABLE but power analyses and an experimental redesign focussing on these two factors would be necessary to more robustly make this conclusion."

  2. Should I need to perform a power analysis, would that be possible with MATLAB, as this past issue suggests: https://github.com/0todd0000/spm1dmatlab/issues/54? I wasn't able to locate anything via a Google search and was wondering if the Python version (https://spm1d.org/power1d/) is the only one available.

  3. This is a comparison for a different dependent variable (Picture1). I am using the same test as above: spmlist = spm1d.stats.anova3tworm(Y, A, B, C, SUBJ);

When you get multiple main effects that are significant (see above for MAIN A, which is gender, and MAIN C, which is the FPA) and you perform the post hoc tests as I did below, how similar do you expect the graphs to be, considering one is an F test and the other is a paired t-test (main test with 2 variables compared vs post hoc with 2 variables compared)? Note that this is a different (dependent) variable that I am now comparing for each condition (Picture2).

  4. Last and final question:

Since my experiment's main focus is the COM, and since the jumps were bilateral (i.e., subjects jumped and landed on both legs, which I would expect to produce very little laterality/asymmetry), I did not just randomly choose a side (L or R knee or hip, as previous studies have done). Instead I employed a more rationale-driven approach: I chose the "side" for analysis based on the relative medial-lateral position of the COM during the initial phase of stance (the first 30 points of the stance phase, with phases dictated by velocity; for most if not all subjects these 30 points corresponded to half of the eccentric phase of the jump).

Figs 1 and 2. The dependent variable when the COM was more ipsilateral or contralateral in the medial-lateral direction (com_distal, com_proximal).

Given that MAIN C is even more significant now, factoring in COM proximity, than when I randomly picked a side, would you think the differences between COM ipsi vs COM contralateral (Figs 1 and 2 above) are large enough for me to add this as a factor in the RM analysis? I do not love the idea of separately running a 3-way RM ANOVA for each of the ipsilateral and contralateral COM, mostly due to redundancy. Any thoughts?

Thanks a lot again! Leo

0todd0000 commented 9 months ago

Would you say that adding a couple more trials for each condition for my entire cohort of subjects would help protect against Type II errors, thus perhaps allowing more robust and definitive conclusions?

More subjects will increase power but more trials will likely not. If conclusions pertain to the population of subjects then only increasing the number of subjects will increase power. However, arbitrarily adding subjects is not necessarily a good idea because then you run the risk of over-powering the study, wherein tiny and unimportant effects can be detected.
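The subjects-versus-trials point can be checked with a quick simulation. A hedged sketch, not part of spm1d (0D paired t-test; the effect size and noise magnitudes are made-up numbers): averaging more trials per subject only shrinks the trial-level noise, while the between-subject variability, which dominates power, is reduced only by adding subjects.

```python
import numpy as np
from scipy import stats

def power(n_subj, n_trials, effect=0.5, sd_subj=1.0, sd_trial=1.0,
          alpha=0.05, n_sim=1000, seed=0):
    """Monte Carlo power of a paired t-test on per-subject mean differences."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        subj = rng.normal(effect, sd_subj, n_subj)           # true per-subject effects
        trials = rng.normal(subj[:, None], sd_trial,
                            (n_subj, n_trials))              # noisy trial repeats
        d = trials.mean(axis=1)                              # per-subject mean difference
        hits += stats.ttest_1samp(d, 0.0).pvalue < alpha
    return hits / n_sim

# More trials (20 subjects, 3 vs 10 trials) barely moves power;
# more subjects (20 vs 40) moves it substantially.
print(power(20, 3), power(20, 10), power(40, 3))
```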



Should I need to perform a power analysis, would that be possible with MATLAB?

power1d is currently available only for Python.



When you get multiple main effects that are significant (see above for MAIN A, which is gender, and MAIN C, which is the FPA) and you perform the post hoc tests as I did below, how similar do you expect the graphs to be, considering one is an F test and the other is a paired t-test (main test with 2 variables compared vs post hoc with 2 variables compared)? Note that this is a different (dependent) variable that I am now comparing for each condition.

Post hoc results are expected to be qualitatively quite similar to the main ANOVA results. You must use the same dependent variable in post hoc tests.
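This correspondence can be made concrete for the two-condition case, where the repeated-measures ANOVA F statistic is exactly the square of the paired t statistic (node by node for 1D data). A minimal 0D sketch with synthetic numbers, computing the RM ANOVA by hand rather than via spm1d:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 12
y = rng.normal(size=(n, 2))    # n subjects x 2 conditions
y[:, 1] += 0.8                 # add a condition effect

# Paired t-test (post hoc style)
t = stats.ttest_rel(y[:, 0], y[:, 1]).statistic

# One-way repeated-measures ANOVA by hand (k = 2 conditions)
gm = y.mean()
ss_cond = n * ((y.mean(axis=0) - gm) ** 2).sum()     # df = k - 1 = 1
ss_subj = 2 * ((y.mean(axis=1) - gm) ** 2).sum()     # df = n - 1
ss_err = ((y - gm) ** 2).sum() - ss_cond - ss_subj   # df = (n - 1)(k - 1)
F = ss_cond / (ss_err / (n - 1))

assert np.isclose(F, t ** 2)   # F equals the squared paired t statistic
```

With more than two levels, or with an F test spanning multiple factors, the post hoc t curves no longer reproduce the F curve exactly, but they should remain qualitatively similar.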



...would you think the differences between COM ipsi vs COM contralateral (Fig 1 and 2 above) are large enough for me to add this as a factor in the RM analysis?

This is a modeling question so it is difficult to answer. My general advice is: you must choose a model that most accurately represents your experiment. If you do not then you must clearly justify why you are choosing a different model.

leo10acm commented 9 months ago

Thank you, Todd!