0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

Smoothness and statistical power #246

Closed 0todd0000 closed 10 months ago

0todd0000 commented 1 year ago

(The questions below are paraphrased from an email discussion.)



Q. Is greater smoothness preferable? Is greater smoothness associated with greater statistical power?

Q. Is a two-sample comparison valid if one group (e.g. "Post") has greater smoothness than another group (e.g. "Pre")?

Q. If one group has greater smoothness than the other, does this increase power?

0todd0000 commented 1 year ago

Q. Is greater smoothness preferable? Is greater smoothness associated with greater statistical power?

A. Smoothness is just a characteristic of the data, similar to the standard deviation (SD). Like the SD, it needs to be estimated from the data. So asking whether greater smoothness is preferable is similar to asking whether a lower SD is preferable. A lower SD is generally associated with greater power, so in this sense it may indeed be preferable to a larger SD. However, the SD is not usually something that can be controlled because it depends on the measurement system and the nature of the measured phenomenon. One simply uses the measured data to calculate the sample SD, which serves as an estimate of the true population-level SD. In the same way, one uses the measured data to estimate population smoothness. And, as with a lower SD, greater smoothness is generally associated with greater power.
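To make the SD analogy concrete, here is a minimal NumPy sketch that estimates both quantities from simulated curves. It assumes smoothness is expressed as the FWHM of a Gaussian kernel and uses a gradient-based estimator computed from the residuals. The function names (`smooth_noise`, `estimate_sd`, `estimate_fwhm`) and the simulated "Pre" / "Post" data are illustrative assumptions, not spm1d's own code.

```python
import numpy as np

def smooth_noise(n_curves, n_nodes, fwhm, rng):
    """Unit-variance Gaussian noise smoothed by a Gaussian kernel of the given FWHM."""
    sd = fwhm / np.sqrt(8 * np.log(2))         # kernel SD implied by the FWHM
    pad = int(np.ceil(4 * sd))                 # padding so edges are smoothed like the interior
    x = np.arange(-pad, pad + 1)
    kernel = np.exp(-0.5 * (x / sd) ** 2)
    kernel /= np.sqrt((kernel ** 2).sum())     # preserves unit variance after smoothing
    white = rng.standard_normal((n_curves, n_nodes + 2 * pad))
    return np.array([np.convolve(w, kernel, mode="valid") for w in white])

def estimate_sd(y):
    """Mean pointwise sample SD across the 1D domain."""
    return y.std(axis=0, ddof=1).mean()

def estimate_fwhm(y):
    """Gradient-based FWHM estimate from residuals about the sample mean."""
    r = y - y.mean(axis=0)                     # residuals
    ssq = (r ** 2).sum(axis=0)                 # residual sum of squares at each node
    grad = np.gradient(r, axis=1)              # derivative along the 1D domain
    v = (grad ** 2).sum(axis=0) / ssq          # normalized gradient variance at each node
    resels_per_node = np.sqrt(v / (4 * np.log(2)))
    return 1.0 / resels_per_node.mean()

rng = np.random.default_rng(0)
pre = smooth_noise(50, 101, fwhm=10, rng=rng)   # rougher hypothetical group
post = smooth_noise(50, 101, fwhm=30, rng=rng)  # smoother hypothetical group

fwhm_pre, fwhm_post = estimate_fwhm(pre), estimate_fwhm(post)
print(f"Pre:  SD = {estimate_sd(pre):.2f}, FWHM = {fwhm_pre:.1f}")
print(f"Post: SD = {estimate_sd(post):.2f}, FWHM = {fwhm_post:.1f}")
```

Both groups are simulated with the same SD, so only the FWHM estimates should differ appreciably. Like the sample SD, the FWHM estimate is itself a random quantity that varies from sample to sample.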



Q. Is a two-sample comparison valid if one group (e.g. "Post") has greater smoothness than another group (e.g. "Pre")?

A. Yes. Non-identical smoothness across groups may be referred to as "heterogeneous smoothness". The figure below is from Pataky et al. (2019) and depicts different smoothness characteristics, including heterogeneity as depicted in panel (d). This manuscript demonstrates that SPM can indeed validly handle the heterogeneous case, with the minor caveat that cluster-level p-values can be marginally inaccurate (by about 1%).

[Figure: smoothness terminology, from Pataky et al. (2019)]



Q. If one group has greater smoothness than the other, does this increase power?

A. Yes, but it depends on what your reference is. As above, let us say that the two groups are "Pre" and "Post" and that the "Post" group exhibits markedly smoother data. If one regards "Pre" as the datum for comparison, and assumes that all populations / tasks have the same smoothness as "Pre", then this is correct: the greater "Post" smoothness leads to greater power relative to that "Pre" datum scenario.

However, one needn't make the assumption that all groups have the same smoothness as some datum condition. Instead one can assume that smoothness may differ from group to group, and then deal with this difference during statistical inference. This perspective is identical to the "unequal variance" perspective in multi-group comparisons including t-tests and ANOVA; occasionally it is justifiable to assume that all groups have equal variance, but in general it is more appropriate to assume that they may have different variances, to estimate each group's variance from the data, and then to compute relevant probabilities based on those estimates. These unequal variance scenarios are theoretically well-developed and handled automatically in most statistical software. One can, and perhaps ought to, regard smoothness in the same manner: inferences and conclusions regarding the groups should be limited to the observed / measured smoothness characteristics. From this perspective it doesn't really make sense to ask whether greater smoothness increases power, just as it doesn't really make sense to ask whether lower SD in one group increases power. One estimates SD / smoothness then bases one's inferences on those estimates.
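The reference-dependent power claim can be illustrated with a small Monte Carlo sketch: under the null hypothesis, the critical threshold for the maximum |t| across the domain is lower for smoother data, so a given effect is easier to detect. This is only a simulation under stated assumptions (Gaussian-kernel smoothness, equal variance, 10 curves per group). spm1d derives its thresholds analytically from random field theory, and the helper names below are hypothetical.

```python
import numpy as np

def smoothing_matrix(n_nodes, fwhm):
    """Matrix that maps padded white noise to unit-variance smooth noise."""
    sd = fwhm / np.sqrt(8 * np.log(2))
    pad = int(np.ceil(4 * sd))
    x = np.arange(-pad, pad + 1)
    kernel = np.exp(-0.5 * (x / sd) ** 2)
    kernel /= np.sqrt((kernel ** 2).sum())     # unit variance after smoothing
    K = np.zeros((n_nodes + 2 * pad, n_nodes))
    for q in range(n_nodes):
        K[q:q + kernel.size, q] = kernel       # one shifted kernel copy per node
    return K

def critical_threshold(fwhm, n_per_group=10, n_nodes=101, n_iter=1000,
                       alpha=0.05, seed=0):
    """Simulated (1 - alpha) quantile of max |t| for a two-sample test under the null."""
    rng = np.random.default_rng(seed)
    K = smoothing_matrix(n_nodes, fwhm)
    n = n_per_group
    tmax = np.empty(n_iter)
    for i in range(n_iter):
        y = rng.standard_normal((2 * n, K.shape[0])) @ K   # 2n smooth null curves
        a, b = y[:n], y[n:]
        # pooled SD (simple average of variances is valid because group sizes are equal)
        sp = np.sqrt((a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)) / 2)
        t = (a.mean(axis=0) - b.mean(axis=0)) / (sp * np.sqrt(2 / n))
        tmax[i] = np.abs(t).max()
    return np.quantile(tmax, 1 - alpha)

t_rough = critical_threshold(fwhm=5)
t_smooth = critical_threshold(fwhm=40)
print(f"critical |t|, FWHM = 5:  {t_rough:.2f}")
print(f"critical |t|, FWHM = 40: {t_smooth:.2f}")
```

The rougher field contains more effectively independent values, so its maximum is more likely to be large by chance and the threshold has to be higher.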



References

Pataky, T. C., Vanrenterghem, J., Robinson, M. A., & Liebl, D. (2019). On the validity of statistical parametric mapping for nonuniformly and heterogeneously smooth one-dimensional biomechanical data. Journal of Biomechanics, 91, 114-123. https://doi.org/10.1016/j.jbiomech.2019.05.018

jahogg commented 1 year ago

Hi Todd, thank you for your helpful explanations. Digging deeper on the final question: in the case of heterogeneous smoothness, would it follow that the ordering of the curves would alter significance testing? I.e., if the smoother curve were considered the reference curve, would this elicit the same result as if the non-smooth curve were the reference? Lastly, I gather that heterogeneous curves would result in minimal error (1%). Would this indicate that an unequal variance adjustment (e.g., a Levene's test) is not needed? Would there ever be a case in which you would recommend an adjustment to account for extremely heterogeneous curves?

0todd0000 commented 1 year ago

...in the case of heterogeneous smoothness, would it follow that the ordering of the curves would alter significance testing? I.e., if the smoother curve were considered the reference curve, would this elicit the same result as if the non-smooth curve were the reference?

The order should not affect results. If you regard the reference as fixed, then smoother-than-reference data will increase power and rougher-than-reference data will decrease power.
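A quick numerical check of the order claim, as a sketch: swapping the two groups only flips the sign of the pointwise t statistic, and a smoothness estimate based on pooled residuals is unchanged. The simulated data, the function names, and the assumption that smoothness is estimated from pooled residuals are all illustrative.

```python
import numpy as np

def ttest2_curve(a, b):
    """Pointwise two-sample t statistic (pooled variance) along the 1D domain."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * a.var(axis=0, ddof=1)
           + (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(sp2 * (1 / na + 1 / nb))

def pooled_fwhm(a, b):
    """Gradient-based FWHM estimate from the residuals of both groups combined."""
    r = np.vstack([a - a.mean(axis=0), b - b.mean(axis=0)])   # pooled residuals
    grad = np.gradient(r, axis=1)
    v = (grad ** 2).sum(axis=0) / (r ** 2).sum(axis=0)
    return 1.0 / np.sqrt(v / (4 * np.log(2))).mean()

rng = np.random.default_rng(1)
pre = rng.standard_normal((8, 101))                           # rough curves
post = np.array([np.convolve(w, np.ones(15) / 15, mode="same")
                 for w in rng.standard_normal((8, 101))]) + 0.5   # smoother, shifted curves

t_pre_post = ttest2_curve(pre, post)
t_post_pre = ttest2_curve(post, pre)

print(np.allclose(t_pre_post, -t_post_pre))                        # sign flip only
print(np.isclose(pooled_fwhm(pre, post), pooled_fwhm(post, pre)))  # same smoothness estimate
```

Because |t| and the smoothness estimate are both unchanged by the swap, a two-tailed inference built on them does not depend on which group is labelled the reference.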



Lastly, I gather that heterogeneous curves would result in minimal error (1%).

I don't think that there is a direct association between smoothness heterogeneity and error because one can regard all measurements as equally accurate.



Would this indicate that an unequal variance adjustment (e.g., a Levene's test) is not needed?

A dataset can have heterogeneous smoothness but constant variance, constant both across the 1D domain and within groups, so there is not necessarily a direct association between heterogeneity and variance. I therefore believe that unequal variance is an issue independent of smoothness heterogeneity. spm1d handles unequal variance for two-sample comparisons and one-way ANOVA, but not yet for m-way ANOVA where m>1.
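For reference, the unequal-variance (Welch) form of the two-sample statistic can be sketched pointwise as follows. This is a generic textbook construction applied at each node of the 1D domain, with hypothetical data and function names; spm1d's own unequal-variance handling is not reproduced here.

```python
import numpy as np

def welch_t_curve(a, b):
    """Pointwise Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va = a.var(axis=0, ddof=1) / na            # squared standard error of group A mean
    vb = b.var(axis=0, ddof=1) / nb            # squared standard error of group B mean
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

rng = np.random.default_rng(2)
a = rng.standard_normal((10, 101))             # group with SD = 1
b = 3 * rng.standard_normal((6, 101)) + 1      # group with SD = 3 and fewer curves

t, df = welch_t_curve(a, b)
print(f"effective df ranges from {df.min():.1f} to {df.max():.1f}")
```

The effective degrees of freedom always lie between the smaller group's `n - 1` and the pooled value `na + nb - 2`. They depend only on the group variances and sample sizes, not on smoothness, which is consistent with treating the two issues separately.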



Would there ever be a case in which you would recommend an adjustment to account for extremely heterogeneous curves?

If the data are so heterogeneous that they are not homologous then they should probably not be compared using any method, including SPM. If the data remain homologous despite heterogeneity then SPM would likely be suitable. However, more robust modeling and treatment of heterogeneity may require methods from the functional data analysis (FDA) literature.