tmatta / lsasim

Simulate large scale assessment data

rho #15

Closed wleoncio closed 3 years ago

wleoncio commented 3 years ago

0. Setup

I've tested most of the values below. Not all tests are shown in this report; I only included those that produce errors, warnings, or inconsistent results.

cluster_gen_2 <- function(...) {
  cluster_gen(..., verbose = FALSE, calc_weights = FALSE)
}
set.seed(12334)
n1 <- c(3, 6)
n2 <- c(groups = 4, people = 2)
n3 <- c(school = 3, class = 2, student = 5)
n4 <- c(20, 50)
n5 <- list(school = 3, class = c(2, 1, 3), student = c(20, 20, 10, 30, 30, 30))
n5a <- list(school = 3, class = c(2, 3, 3), student = c(20, 20, 10, 30, 30, 30))
n6 <- list(school = 3, class = c(2, 1, 3), student = ranges(10, 50))
n6a <- list(school = 3, class = c(2, 3, 3), student = ranges(10, 50))
n7 <- list(school = 10, student = ranges(10, 50))
n8 <- list(school = 3, student = c(20, 20, 10))
n8a <- list(school = 3, class = c(2, 2, 2), student = c(20, 20, 10))
n8b <- list(school = 3, class = c(2, 3, 3), student = c(20, 20, 10, 5))
n8c <- list(school = 3, class = c(2, 1, 3), student = c(20, 20, 10))
n9 <- list(school = 10, class = c(2, 1, 3, 1, 1, 1, 2, 1, 2, 1), student = ranges(10, 50))
n10 <- list(country = 2, school = 10, class = c(2, 1, 3, 1, 1, 1, 2, 1, 2, 1), student = ranges(10, 50))
n11 <- list(culture = 2, country = 2, school = 10, class = c(2, 1, 3, 1, 1, 1, 2, 1, 2, 1), student = ranges(10, 50))
n12 <- list(culture = 2, country = 2, district = 3, school = 10, class = c(2, 1, 3, 1, 1, 1, 2, 1, 2, 1), student = ranges(10, 50))
N1 <- c(100, 20)

5. rho

Overall suggestions

Error and warning messages

set.seed(12334)
n7 <- list(school = 10, student = ranges(1000, 5000))
r1 <- cluster_gen_2(n4, rho = c(0.005, 0.01, 0.85))
r1a <- cluster_gen_2(n4, rho = c(0.005, 0.001, 0.85))
r2 <- cluster_gen_2(n4, rho = c(0.05, 0.1, 0.45))
r3 <- cluster_gen_2(n4, rho = c(0.5, 0.45))
r4 <- cluster_gen_2(n4, rho = 0.2)

r5 <- cluster_gen_2(n7, rho = c(0.005, 0.01, 0.85))
r6 <- cluster_gen_2(n7, rho = c(0.05, 0.1, 0.45))
r7 <- cluster_gen_2(n7, rho = c(0.5, 0.45))
r8 <- cluster_gen_2(n7, rho = 0.2)

anova_table(r1)   # q1: 0.0338, q2: 0.002110038, q3: 0.8536386
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        1.2348491          1.23484913
## 2 Between-group variance        0.0679415          0.04324451
## 3         Total variance        1.2759725                  NA
##
## Intraclass correlation
##     Estimated Standard.error
## q1 0.03383517     0.01683251
##
## Testing for group differences
## F-statistic: 2.751004 on 19 and 980 DF. p-value:  8.239076e-05
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        2.1489794         2.148979419
## 2 Between-group variance        0.0475236         0.004544016
## 3         Total variance        2.1533006                  NA
##
## Intraclass correlation
##      Estimated Standard.error
## q2 0.002110038    0.007217181
##
## Testing for group differences
## F-statistic: 1.105725 on 19 and 980 DF. p-value:  0.3386599
##
## ANOVA table for schools, q3
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance         5.565225            5.565225
## 2 Between-group variance        32.569925           32.458621
## 3         Total variance        36.431781                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q3 0.8536386     0.04108777
##
## Testing for group differences
## F-statistic: 292.62 on 19 and 980 DF. p-value:  0
anova_table(r2)   # q1: 0.09291841, q2: 0.1539361, q3: 0.4754647
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        4.2182094           4.2182094
## 2 Between-group variance        0.3586323           0.2742681
## 3         Total variance        4.4790249                  NA
##
## Intraclass correlation
##     Estimated Standard.error
## q1 0.06105053      0.0245658
##
## Testing for group differences
## F-statistic: 4.251002 on 19 and 980 DF. p-value:  3.672846e-09
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        2.1057986           2.1057986
## 2 Between-group variance        0.2435799           0.2014639
## 3         Total variance        2.2973809                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q2 0.0873173     0.03157835
##
## Testing for group differences
## F-statistic: 5.783552 on 19 and 980 DF. p-value:  6.469704e-14
##
## ANOVA table for schools, q3
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance         9.569550            9.569550
## 2 Between-group variance         6.898066            6.706675
## 3         Total variance        15.947269                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q3 0.4120535     0.08166513
##
## Testing for group differences
## F-statistic: 36.04175 on 19 and 980 DF. p-value:  6.185958e-99
anova_table(r3)   # q1: 0.5526836 , q2: 0.4551122
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance         1.233057            1.233057
## 2 Between-group variance         1.448267            1.423606
## 3         Total variance         2.586836                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q1 0.5358625     0.08292473
##
## Testing for group differences
## F-statistic: 58.72671 on 19 and 980 DF. p-value:  5.54482e-147
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        0.5971189           0.5971189
## 2 Between-group variance        0.4091505           0.3972081
## 3         Total variance        0.9748443                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q2 0.3994743     0.08098612
##
## Testing for group differences
## F-statistic: 34.26039 on 19 and 980 DF. p-value:  1.126312e-94
anova_table(r4)   # q1: 0.1916534, q2: 0.1646088, q3: 0.2331884
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance       0.72408847          0.72408847
## 2 Between-group variance       0.09280016          0.07831839
## 3         Total variance       0.79856542                  NA
##
## Intraclass correlation
##     Estimated Standard.error
## q1 0.09760434     0.03420395
##
## Testing for group differences
## F-statistic: 6.408068 on 19 and 980 DF. p-value:  6.883393e-16
anova_table(r5)   #0.007049268, 0.01083008, 0.8922014
## Warning in anova_table(r5): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance      0.401423629         0.401423629
## 2 Between-group variance      0.003983657         0.003783693
## 3         Total variance      0.404787202                  NA
##
## Intraclass correlation
##      Estimated Standard.error
## q1 0.009337672             NA
##
## Testing for group differences
## F-statistic: 19.92189 on 9 and 20315 DF. p-value:  1.021604e-33
## Warning in anova_table(r5): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance      0.891001362         0.891001362
## 2 Between-group variance      0.009014708         0.008570867
## 3         Total variance      0.898620568                  NA
##
## Intraclass correlation
##      Estimated Standard.error
## q2 0.009527714             NA
##
## Testing for group differences
## F-statistic: 20.3107 on 9 and 20315 DF. p-value:  1.92721e-34
## Warning in anova_table(r5): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q3
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance         4.990826            4.990826
## 2 Between-group variance        23.438161           23.435675
## 3         Total variance        25.824334                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q3 0.8244305             NA
##
## Testing for group differences
## F-statistic: 9427.634 on 9 and 20315 DF. p-value:  0
anova_table(r6)   #0.05014786, 0.03788846, 0.4941906
## Warning in anova_table(r6): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        5.3112624           5.3112624
## 2 Between-group variance        0.2459524           0.2434785
## 3         Total variance        5.5273645                  NA
##
## Intraclass correlation
##     Estimated Standard.error
## q1 0.04383256             NA
##
## Testing for group differences
## F-statistic: 99.41856 on 9 and 21761 DF. p-value:  5.575734e-183
## Warning in anova_table(r6): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        1.0801539           1.0801539
## 2 Between-group variance        0.1265839           0.1260807
## 3         Total variance        1.1920582                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q2 0.1045242             NA
##
## Testing for group differences
## F-statistic: 251.5978 on 9 and 21761 DF. p-value:  0
## Warning in anova_table(r6): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q3
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        0.2859314           0.2859314
## 2 Between-group variance        0.1964730           0.1963398
## 3         Total variance        0.4601950                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q3  0.407115             NA
##
## Testing for group differences
## F-statistic: 1475.215 on 9 and 21761 DF. p-value:  0
anova_table(r7)   #0.5547685 0.2916394
## Warning in anova_table(r7): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        0.6410636           0.6410636
## 2 Between-group variance        0.2991418           0.2988696
## 3         Total variance        0.9065160                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q1  0.317969             NA
##
## Testing for group differences
## F-statistic: 1099.189 on 9 and 23860 DF. p-value:  0
## Warning in anova_table(r7): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q2
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance         2.372429            2.372429
## 2 Between-group variance         1.713179            1.712172
## 3         Total variance         3.893159                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q2 0.4191773             NA
##
## Testing for group differences
## F-statistic: 1701.008 on 9 and 23860 DF. p-value:  0
anova_table(r8)   #0.1129593 0.1874756 0.2157133
## Warning in anova_table(r8): SE not yet implemented for different sample sizes
##
## ANOVA table for schools, q1
## ANOVA estimators
##                   Source Sample.statistic Population.estimate
## 1  Within-group variance        0.3495809           0.3495809
## 2 Between-group variance        0.1494783           0.1493157
## 3         Total variance        0.4816053                  NA
##
## Intraclass correlation
##    Estimated Standard.error
## q1 0.2992918             NA
##
## Testing for group differences
## F-statistic: 919.2707 on 9 and 21874 DF. p-value:  0
wleoncio commented 3 years ago

lsasim requires the "polycor" package.

Addressed on 10835f919d65efcc70c811705ff4e3e1416ca69d.

wleoncio commented 3 years ago

Warning messages: In anova_table(r5) : SE not yet implemented for different sample sizes

We need some theoretical basis to implement the equation to calculate this. Issue #22 has been opened to track this individually. For now, bee1f4312cd4057681492860f2a48b23aed3f069 implements an argument to get rid of the warning. The message itself has been enhanced to improve user experience.
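
Independently of the argument added in that commit (whose name is not given here), a caller can muffle this one warning with base R condition handling. The sketch below is only an illustration: `toy_anova_table()` and `quiet_se()` are hypothetical names, and the toy function merely imitates the warning text quoted above; it is not the lsasim implementation.

```r
# Stand-in that warns the way anova_table() does in the report above.
# Hypothetical: it does NOT reproduce what lsasim computes.
toy_anova_table <- function() {
  warning("SE not yet implemented for different sample sizes")
  list(icc = 0.8244305, se = NA)
}

# Hypothetical wrapper: muffle only the SE warning, so that any other
# warning raised while evaluating `expr` still reaches the user.
quiet_se <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("SE not yet implemented", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

res <- quiet_se(toy_anova_table())
```

Matching on the message text is brittle if the wording changes, which is one reason a dedicated argument in the package is the more robust route.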

wleoncio commented 3 years ago

If the rhos for different levels vary in scale, the generated rho will not be accurate.

Addressed on cd133d2977a31b653043ec188178fb1d8c197d5d.
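
For reference, the intraclass correlations printed in the tables above are consistent with the usual ANOVA estimator, ICC = between / (between + within), using the population estimates. The base R sketch below recomputes one of the printed values and then checks a requested rho against data simulated with a plain random-intercept model. `icc_anova()` is a hypothetical helper, and the simulation is an assumption for illustration only; it is not how `cluster_gen()` generates data.

```r
# Hypothetical helper (not part of lsasim): one-way ANOVA estimator of the
# intraclass correlation for balanced groups.
icc_anova <- function(y, g) {
  g <- factor(g)
  n <- unique(tabulate(g))              # common group size
  stopifnot(length(n) == 1)             # balanced design only
  k <- nlevels(g)
  group_means <- ave(y, g)              # group mean repeated for each row
  msw <- sum((y - group_means)^2) / (length(y) - k)   # within mean square
  msb <- sum((group_means - mean(y))^2) / (k - 1)     # between mean square
  s2_between <- (msb - msw) / n         # population estimate
  s2_between / (s2_between + msw)
}

# Variance components printed for r1, q1 in the report above
s2_within <- 1.2348491
s2_between <- 0.04324451
icc_r1_q1 <- s2_between / (s2_between + s2_within)

# Assumed random-intercept simulation with total variance 1:
# between-group variance = rho, within-group variance = 1 - rho.
set.seed(12334)
rho_target <- 0.2
n_groups <- 500
n_per_group <- 50
g <- rep(seq_len(n_groups), each = n_per_group)
u <- rnorm(n_groups, sd = sqrt(rho_target))
y <- u[g] + rnorm(length(g), sd = sqrt(1 - rho_target))
icc_hat <- icc_anova(y, g)
```

With only 20 groups, as in `n4`, the estimate is far noisier than with 500, so some distance between the requested and estimated rho is expected from sampling error alone.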