Closed 0todd0000 closed 1 year ago
Answers:
Yes, this is valid. The s_i values are treated as the dependent variable, and a standard two-sample test is usually fine. However, it is worth checking their distribution: SD values cannot be negative, so if many of them are close to zero the distribution may be skewed and therefore non-normal.
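One way to run the distribution check described above is sketched below. It is only an illustration: the helper name `compare_sds` and the SD values are hypothetical and do not come from the original discussion. It reports Shapiro-Wilk normality p-values for each group alongside both a t test and a rank-based Mann-Whitney U test, so the parametric result can be compared with one that does not assume normality.

```python
import numpy as np
from scipy import stats

def compare_sds(sd_a, sd_b):
    """Return normality and group-comparison p-values for two sets of SDs."""
    sd_a = np.asarray(sd_a, dtype=float)
    sd_b = np.asarray(sd_b, dtype=float)
    return {
        # Shapiro-Wilk: null hypothesis is that the sample is normally distributed
        "shapiro_a": stats.shapiro(sd_a).pvalue,
        "shapiro_b": stats.shapiro(sd_b).pvalue,
        # parametric comparison (assumes approximate normality)
        "t_test": stats.ttest_ind(sd_a, sd_b).pvalue,
        # rank-based comparison (no normality assumption)
        "mann_whitney": stats.mannwhitneyu(sd_a, sd_b, alternative="two-sided").pvalue,
    }

# hypothetical within-subject SDs, for illustration only
sd_a = [2.2, 1.7, 5.4, 2.7, 4.4, 1.9, 2.3, 2.8]
sd_b = [7.0, 8.5, 12.6, 8.0, 9.2, 4.5, 8.4, 3.4]
for name, p in compare_sds(sd_a, sd_b).items():
    print(f"{name}: p = {p:.3f}")
```

If the two comparison tests disagree markedly, that may be a sign that the normality assumption matters for the data at hand.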
Yes, the tests are different. An equality-of-variance test tests the hypothesis that the population variances are identical. A two-sample test on the s_i values can be related, but since the s_i values represent within-subject variability, the two tests can yield very different results. Consider the artificial data in the tables below.
Group A:
Subj | Meas. 1 | Meas. 2 | Meas. 3 | Meas. 4 | Meas. 5 |
---|---|---|---|---|---|
1 | 205 | 201 | 202 | 206 | 205 |
2 | 200 | 204 | 202 | 200 | 201 |
3 | 192 | 201 | 202 | 197 | 206 |
4 | 200 | 201 | 197 | 194 | 198 |
5 | 196 | 195 | 194 | 205 | 198 |
6 | 197 | 201 | 198 | 196 | 199 |
7 | 197 | 198 | 197 | 194 | 200 |
8 | 202 | 200 | 203 | 196 | 201 |
Group B:
Subj | Meas. 1 | Meas. 2 | Meas. 3 | Meas. 4 | Meas. 5 |
---|---|---|---|---|---|
1 | 390 | 409 | 398 | 398 | 404 |
2 | 403 | 414 | 397 | 403 | 391 |
3 | 385 | 400 | 398 | 415 | 414 |
4 | 401 | 412 | 412 | 396 | 396 |
5 | 395 | 387 | 407 | 383 | 397 |
6 | 404 | 400 | 403 | 393 | 396 |
7 | 395 | 383 | 404 | 390 | 400 |
8 | 393 | 391 | 394 | 396 | 400 |
These data were generated using the Python script below. The script sets the between-subject SDs to SA = SB = 5 and the true within-subject SDs (s_i) to 3 and 10 for Groups A and B, respectively. For these data, the Levene test for equal variance yields p=0.152, and a two-sample t test on the s_i values yields p=0.001. Clearly, equal-variance tests and tests on within-subject SDs consider different aspects of variability: the former compares SA with SB, and the latter compares the s_i values between groups.
```python
import numpy as np
from scipy import stats

# specify sample and population parameters
N   = 8    # number of subjects
n   = 5    # number of measurements per subject
SA  = 5    # true between-subject SD, Group A
SB  = 5    # true between-subject SD, Group B
sA  = 3    # true within-subject SD, Group A
sB  = 10   # true within-subject SD, Group B
muA = 200  # true population mean, Group A
muB = 400  # true population mean, Group B
# note: SA and SB are defined here but are not used in the generation step below

# generate dataset:
np.random.seed(0)
yA = []
yB = []
for i in range(N):
    yA.append( muA + sA * np.random.randn(n) )
    yB.append( muB + sB * np.random.randn(n) )
yA = np.asarray( yA, dtype=int )  # (N, n) array, truncated to integers
yB = np.asarray( yB, dtype=int )

# conduct equality of variance test on the subject means:
mA  = yA.mean(axis=1)  # within-subject means, Group A
mB  = yB.mean(axis=1)  # within-subject means, Group B
res = stats.levene(mA, mB)
print('Equality of variance test: p = %.3f' %res.pvalue)

# conduct two-sample test on the within-subject SDs:
sdA = yA.std(axis=1, ddof=1)  # within-subject sample SDs, Group A
sdB = yB.std(axis=1, ddof=1)  # within-subject sample SDs, Group B
res = stats.ttest_ind(sdA, sdB)
print('Two-sample test, WS SDs: p = %.3f' %res.pvalue)
```
Results:
Equality of variance test: p = 0.152
Two-sample test, WS SDs: p = 0.001
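For comparison, here is a minimal sketch of how a between-subject SD could enter such a simulation alongside the within-subject SD. This is an assumption about one possible two-level model, not the script that produced the results above: the helper `simulate_group` is hypothetical, and because it uses a different random number generator it will not reproduce the tables or p-values reported here.

```python
import numpy as np

def simulate_group(mu, S, s, N, n, rng):
    """Return an (N, n) array: N subjects with n measurements each."""
    # each subject gets its own true mean (between-subject SD = S)
    subject_means = mu + S * rng.standard_normal(N)
    # measurements scatter around that mean (within-subject SD = s)
    noise = s * rng.standard_normal((N, n))
    return subject_means[:, None] + noise

rng = np.random.default_rng(0)
yA = simulate_group(mu=200, S=5, s=3, N=8, n=5, rng=rng)
yB = simulate_group(mu=400, S=5, s=10, N=8, n=5, rng=rng)
print(yA.shape, yB.shape)  # (8, 5) (8, 5)
```

Under this model the variance of the subject means is S² + s²/n, so a test on the subject means reflects mostly the between-subject SD when n is large, while the within-subject SDs are unaffected by S.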
(This is paraphrased from an email discussion)
Imagine a simple experiment involving:
Questions: