0todd0000 / spm1d

One-Dimensional Statistical Parametric Mapping in Python
GNU General Public License v3.0

partially paired samples t-test #198

Closed karentroy closed 2 years ago

karentroy commented 2 years ago

Hi there, I have two samples that are partially paired (some have condition A only, some have condition B only, and some are paired samples with both conditions). I can calculate an alternate t-statistic using the method outlined by Derrick here: https://digitalcommons.wayne.edu/cgi/viewcontent.cgi?article=2251&context=jmasm

By hand, for a single 0D value, I can calculate a new t-statistic and degrees of freedom. Not surprisingly, the resulting p-value falls somewhere between those of an independent two-sample t-test and a paired t-test. I would like to use a similar method to adapt the spm1d code. However, the way the glm.m code is written, it's not clear to me how I might do this.

I can calculate the alternate t-statistic at each timepoint by hand without much trouble, but I cannot figure out how to calculate the critical threshold (zstar), which seems to be the crux of it. For example, when I put two dummy data sets into the spm1d code and calculate zstar using an independent two-sample t-test and then a paired t-test, the two zstar values differ quite a lot. I'm not sure I understand why.

Thanks! Karen

0todd0000 commented 2 years ago

Hello, sorry for the delay!

I don't think that there is a theoretical solution to this problem in the SPM literature. The key difficulty is that the critical t statistic calculation requires a single degrees-of-freedom (DF) estimate. A similar problem exists in other cases, like unequal variance / nonsphericity estimates, which use DF adjustments. Theoretical SPM solutions exist for the nonsphericity case, wherein a single DF estimate is used for the entire 1D domain. These solutions are relatively complex, and I'm unsure whether a similar approach can be applied to this alternate t statistic case.
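To illustrate why the critical value is tied to the DF, here is a minimal 0D sketch using `scipy.stats.t`. The sample size `n = 10` is a hypothetical value chosen only for illustration. Note that this shows only the DF effect for a single value; the 1D critical threshold also depends on the smoothness of the data, so it will not match these numbers.

```python
from scipy import stats

alpha = 0.05   # one-tailed Type I error rate
n = 10         # hypothetical number of subjects per condition

df_paired = n - 1        # paired t-test
df_indep = 2 * n - 2     # independent two-sample t-test (equal variance)

# 0D critical t values: the smaller the DF, the higher the threshold
crit_paired = stats.t.isf(alpha, df_paired)
crit_indep = stats.t.isf(alpha, df_indep)

print(f"paired      (df={df_paired}): t* = {crit_paired:.3f}")
print(f"independent (df={df_indep}): t* = {crit_indep:.3f}")
```

A partially paired design has a DF somewhere between these two extremes, which is why a single DF estimate is needed before a threshold can be computed.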

As an approximation I'd suggest:

  1. Calculate t(q) and df(q): the t statistic and DF at each point q (as you have already done).
  2. Use mean(df) (across the 1D domain) as the overall DF estimate.
  3. Use this mean DF estimate in spm1d.rft1d.t.isf to calculate the critical threshold.

Note that Step 2 is a hack, and almost certainly not theoretically robust. However, provided the DF estimates do not vary widely across the domain (q), the result should be similar to a theoretically robust solution.
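The three steps above could be sketched in Python roughly as follows. This is a sketch under assumptions, not a tested spm1d recipe: the helper names (`mean_df`, `df_relative_spread`, `critical_threshold`) are hypothetical, `df_q` is assumed to be the array of pointwise DF estimates from Step 1, and the argument order `(alpha, df, nodes, FWHM)` for `spm1d.rft1d.t.isf` should be checked against the installed version. The threshold calculation also needs a smoothness (FWHM) estimate, which is not discussed above, so `fwhm` is simply a value the caller supplies.

```python
import numpy as np


def mean_df(df_q):
    """Step 2: collapse the pointwise DF estimates df(q) into one value."""
    return float(np.mean(np.asarray(df_q, dtype=float)))


def df_relative_spread(df_q):
    """Range of df(q) relative to its mean.

    A small value suggests the DF estimates do not vary widely across
    the domain, which is the condition for the approximation to be usable.
    """
    df_q = np.asarray(df_q, dtype=float)
    return float((df_q.max() - df_q.min()) / df_q.mean())


def critical_threshold(df_q, n_nodes, fwhm, alpha=0.05):
    """Step 3: critical t threshold from the mean DF (one-tailed alpha)."""
    import spm1d  # imported here so the helpers above work without spm1d
    # Assumed argument order: (alpha, df, number of nodes, FWHM)
    return spm1d.rft1d.t.isf(alpha, mean_df(df_q), n_nodes, fwhm)
```

With `t_q` and `df_q` from Step 1, one would first check `df_relative_spread(df_q)`, then compare `t_q` against `critical_threshold(df_q, len(t_q), fwhm)`. For a two-tailed test, pass `alpha / 2` and compare against `abs(t_q)`.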

karentroy commented 2 years ago

Thanks - that's really helpful. Also, thanks for sharing the code and for doing such a great job explaining it so that we can use it! Much appreciated!

0todd0000 commented 2 years ago

No problem, thank you for the feedback!