0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

ordinal instead of continuous within-subject factor for random effects analysis and ANCOVA implementation via GLM #279

Closed adrianrivadulla closed 2 months ago

adrianrivadulla commented 4 months ago

Hi,

I have two groups, A (n = 33) and B (n = 29). These people ran to exhaustion at individually selected speeds. Each individual had a different time to completion (~10 - 25 min), and I collected data in batches of 9 min (to prevent software crashes). I am interested in the changes in kinematics associated with fatigue, the differences between groups, and whether people in groups A and B responded differently. To simplify the analysis, I have decided to divide the run into 3 segments: start (0% of time to completion), mid (50%) and end (100%). I took the 50 strides around each time landmark and calculated the average stride to represent each participant at each segment.
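As a rough sketch of the segment averaging described above: the windowing rule here (a 50-stride window centred on the landmark and clipped to the ends of the run) is an assumption, and `window_indices` and `mean_curve` are hypothetical helper names, not part of spm1d.

```python
def window_indices(n_strides, fraction, width=50):
    """Return (start, stop) of a stride window centred on a time landmark.

    fraction is 0.0 for the start, 0.5 for mid and 1.0 for the end of the run.
    The window is shifted inward when it would run past either end (assumption).
    """
    centre = int(fraction * (n_strides - 1))
    start = centre - width // 2
    start = max(0, min(start, n_strides - width))
    stop = min(n_strides, start + width)
    return start, stop


def mean_curve(strides):
    """Average a list of time-normalised strides node by node."""
    return [sum(node) / len(node) for node in zip(*strides)]


def segment_means(strides, fractions=(0.0, 0.5, 1.0), width=50):
    """Return one mean stride per landmark for a single participant."""
    means = []
    for fraction in fractions:
        start, stop = window_indices(len(strides), fraction, width)
        means.append(mean_curve(strides[start:stop]))
    return means
```

Each participant then contributes three curves (start, mid, end), which is the layout the analyses discussed below expect.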

Firstly, I tried a 2-way ANOVA with one RM factor (Group x Fatigue). I think this doesn't consider that runners ran at different speeds, which could be a confounding factor, unless the SUBJ input is used to calculate subject-specific means, which I don't think is the case? I've also tried normalising the curves to the individual mean and std of the first segment, but I think that this complicates interpretation (the y axis is now in individual standard deviations) and focuses on change, neglecting the Group main effect.

I think the most appropriate analysis is an ANCOVA where speed is kept as a continuous covariate. I was looking at some of the issues and examples and I am aware that this is currently not directly implemented, and that alternative implementations may be found through GLMs. I have two approaches in mind:

  1. Hierarchical RFX model following this example: Level 1 (within factor, fatigue) and Level 2 (between factor, group). Because I am calculating individual betas for each participant, and the magnitude and amplitude of my curves are directly related to speed, this should take speed into account. My question here is that fatigue has been "categorised" into start, mid, end. Since this is ordinal, can I still treat it as continuous and calculate within-subject regressions?

  2. ANCOVA implementation via GLM. There are a few issues on this but let's keep this one and this other one in mind for the discussion. Could I set my design matrix like:

X = [speed, groupA_bool, groupB_bool, start_bool, mid_bool, end_bool, intercept, linear_drift, sin_drift]

Where each bool indicates the group and segment and are mutually exclusive, e.g., if groupA_bool = 1, groupB_bool = 0, and if start_bool = 1, mid_bool = 0, end_bool = 0.

Since F contrasts are not supported, I wouldn't be able to test the interaction of the two factors, right? I guess the most appropriate option would again be to use the betas from point 1 instead of the three segments, but then I am back to the question of whether it is appropriate to use regression with the start, mid, end segments.

Sorry for the long message but, does any of this make sense?

Thank you

0todd0000 commented 4 months ago

Firstly, I tried a 2-way ANOVA with one RM factor (Group x Fatigue). I think this doesn't consider that runners ran at different speeds, which could be a confounding factor, unless the SUBJ input is used to calculate subject-specific means, which I don't think is the case?

RM-ANOVA models consider only within-subject changes and not the overall subject means, so if subjects have systematically different speeds this will not matter for RM-ANOVA.
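A small illustration of that point for the repeated (fatigue) factor, using made-up numbers at a single time node. This only sketches the general idea of within-subject deviations; it is not spm1d's internal computation, and `within_subject_deviations` is a hypothetical helper.

```python
def within_subject_deviations(values):
    """Subtract the subject's own mean from each repeated measurement."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Made-up values at one time node for start, mid, end (illustration only).
slow_runner = [10.0, 12.0, 14.0]

# The same pattern of change, shifted by a constant subject-level offset,
# as might happen for a runner whose curves are larger overall.
fast_runner = [v + 5.0 for v in slow_runner]

dev_slow = within_subject_deviations(slow_runner)
dev_fast = within_subject_deviations(fast_runner)
```

Both runners yield the same deviations from their own mean, so a constant subject-level offset drops out of the within-subject comparison.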



Hierarchical RFX model following this example: Level 1 (within factor, fatigue) and Level 2 (between factor, group). Because I am calculating individual betas for each participant, and the magnitude and amplitude of my curves are directly related to speed, this should take speed into account. My question here is that fatigue has been "categorised" into start, mid, end. Since this is ordinal, can I still treat it as continuous and calculate within-subject regressions?

Since spm1d doesn't support ordinal variables directly, the analyses may be non-ideal, but I'd suggest running it both ways: treating fatigue as categorical in an ANOVA-type analysis, then as continuous in a regression-type analysis. The results will likely be similar.
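For the regression-type option, a minimal sketch of the level-1 step might look like the following. It assumes fatigue is coded as 0, 50 and 100 (% of time to completion) and fits an ordinary least-squares slope at every time node for one participant; `ols_slope` and `subject_slope_curve` are hypothetical helpers, and the curves are made up.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def subject_slope_curve(x, curves):
    """Fit one slope per time node across a participant's repeated curves.

    curves holds one curve per fatigue level, all with the same number of nodes.
    """
    return [ols_slope(x, node_values) for node_values in zip(*curves)]


fatigue = [0.0, 50.0, 100.0]      # assumed coding of start, mid, end
curves = [[1.0, 2.0],             # start (made-up, two nodes only)
          [2.0, 2.0],             # mid
          [3.0, 2.0]]             # end
beta = subject_slope_curve(fatigue, curves)
```

The resulting per-participant slope curves would then be carried to level 2 and compared between groups, whereas the ANOVA-type analysis keeps start, mid and end as three separate levels.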



ANCOVA implementation via GLM... Could I set my design matrix like:

X = [speed, groupA_bool, groupB_bool, start_bool, mid_bool, end_bool, intercept, linear_drift, sin_drift]

where each bool indicates the group and segment and are mutually exclusive, e.g., if groupA_bool = 1, groupB_bool = 0, and if start_bool = 1, mid_bool = 0, end_bool = 0.

Yes, that looks like a reasonable model.
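A minimal sketch of how such a design matrix could be assembled, with one row per participant and segment. The drift columns are left out because their definition is not given in this thread, the speeds are made up, and `design_row` and `design_matrix` are hypothetical helpers rather than spm1d functions.

```python
COLUMNS = ["speed", "groupA_bool", "groupB_bool",
           "start_bool", "mid_bool", "end_bool", "intercept"]
GROUPS = ("A", "B")
SEGMENTS = ("start", "mid", "end")


def design_row(speed, group, segment):
    """Return one row of X in the column order given by COLUMNS."""
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    if segment not in SEGMENTS:
        raise ValueError(f"unknown segment: {segment!r}")
    group_cols = [1.0 if group == g else 0.0 for g in GROUPS]
    segment_cols = [1.0 if segment == s else 0.0 for s in SEGMENTS]
    return [float(speed)] + group_cols + segment_cols + [1.0]


def design_matrix(subjects):
    """Build rows for every (participant, segment) pair.

    subjects is a list of (speed, group) tuples.
    """
    return [design_row(speed, group, segment)
            for speed, group in subjects
            for segment in SEGMENTS]


# Made-up running speeds in m/s, for illustration only.
subjects = [(3.2, "A"), (3.8, "A"), (3.5, "B")]
X = design_matrix(subjects)
```

One thing the sketch makes visible: the two group columns always sum to the intercept column, as do the three segment columns, so the columns are linearly dependent. That is worth keeping in mind when choosing contrasts.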



Since F contrasts are not supported, I wouldn't be able to test the interaction of the two factors, right?

Correct. F-contrasts will be supported in the next major spm1d release (version 0.5). When using version 0.4 I'd suggest using only t contrasts in a post hoc type approach.
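As a sketch of what a post hoc set of t contrasts could look like for the design matrix columns quoted above (drift columns omitted): the choice of comparisons and the Bonferroni adjustment are assumptions made for illustration, and `contrast` is a hypothetical helper.

```python
COLUMNS = ["speed", "groupA_bool", "groupB_bool",
           "start_bool", "mid_bool", "end_bool", "intercept"]


def contrast(**weights):
    """Build a t-contrast vector from named column weights (others are zero)."""
    unknown = set(weights) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    return [float(weights.get(name, 0.0)) for name in COLUMNS]


# Pairwise comparisons, each tested separately with a t contrast.
contrasts = {
    "A - B": contrast(groupA_bool=1, groupB_bool=-1),
    "mid - start": contrast(mid_bool=1, start_bool=-1),
    "end - mid": contrast(end_bool=1, mid_bool=-1),
    "end - start": contrast(end_bool=1, start_bool=-1),
}

# Bonferroni adjustment across the set of comparisons (an assumption;
# other corrections are possible).
alpha = 0.05
alpha_adjusted = alpha / len(contrasts)
```

Each vector would be supplied as the contrast for a separate GLM test, with inference at the adjusted alpha. Note that this set covers the two main effects only, not the Group x Fatigue interaction.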

adrianrivadulla commented 4 months ago

Hi Todd,

Thank you for your reply. I managed to run the multilevel analysis and as you said in your message, the main conclusions are the same as with the ANOVA. I think I am going to stick to ANOVA for now and will reconsider the more complex approaches in future projects.

Thank you!