pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation

`along` "works" for variables in the global environment (was: Simulate smaller sample sizes instead of larger ones) #161

Open kalenkovich opened 4 years ago

kalenkovich commented 4 years ago

NB: this is a cross-post of a StackOverflow question.

I am estimating sample size based on the data from a previous experiment that included 40 participants.

I used simr::powerCurve for several sample sizes smaller than the original one:

pc <- powerCurve(fit = model, nsim = 100, alpha = 0.02,
                 breaks = c(10, 20, 30, 40), along = 'subject_id')

The results are identical for all sizes and are close to 100%. I assume this is due to the simulated sample size being smaller than the original one.

Is there a way to estimate power for sample sizes smaller than the one used to fit the model?

Here is a reproducible example using synthetic data (code taken from https://humburg.github.io/Power-Analysis/simr_power_analysis.html and adapted slightly):

library(simr)

subj <- factor(1:40)
class_id <- letters[1:5]
time <- 0:2
group <- c("control", "intervention")

subj_full <- rep(subj, 15)
class_full <- rep(rep(class_id, each=10), 3)
time_full <- rep(time, each=50)
group_full <- rep(rep(group, each=5), 15)

covars <- data.frame(id=subj_full, class=class_full, treat=group_full, time=factor(time_full))

## Intercept and slopes for intervention, time1, time2
fixed <- c(5, 2, 0.1, 0.2)

## Random intercepts for participants clustered by class
rand <- list(0.5, 0.1)

## Residual standard deviation (passed to makeLmer as sigma)
res <- 2

model <- makeLmer(y ~ treat + time + (1|class/id), fixef=fixed, VarCorr=rand, sigma=res, data=covars)

pc <- powerCurve(model, test = fixed('treat'), nsim=100, along='subj', breaks = c(10, 20, 30, 40))
print(pc)

The output is

Power for predictor 'treat', (95% confidence interval),
by number of levels in subj:
     10: 100.0% (96.38, 100.0) - 150 rows
     20: 100.0% (96.38, 100.0) - 300 rows
     30: 100.0% (96.38, 100.0) - 450 rows
     40: 100.0% (96.38, 100.0) - 600 rows

Time elapsed: 0 h 0 m 55 s

pitakakariki commented 3 years ago

along='subj' should probably generate an error here...
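
For context, the data frame in the example names its participant column `id`, while `subj` exists only as a variable in the global environment, which is what the issue title refers to. Until such an error exists, a user-side guard could look roughly like the sketch below. It uses base R only; `check_along` is a hypothetical helper and not part of simr, and the data frame is rebuilt from the example so the block runs on its own.

```r
# Rebuild the covariate frame from the example above (base R only).
subj <- factor(1:40)  # a global variable, NOT a column of covars
covars <- data.frame(
  id    = rep(subj, 15),
  class = rep(rep(letters[1:5], each = 10), 3),
  treat = rep(rep(c("control", "intervention"), each = 5), 15),
  time  = factor(rep(0:2, each = 50))
)

# Hypothetical helper (not part of simr): fail fast when `along` does not
# name a column of the data.
check_along <- function(data, along) {
  if (!along %in% names(data)) {
    stop("`along = \"", along, "\"` is not a column of the data; ",
         "available columns: ", paste(names(data), collapse = ", "),
         call. = FALSE)
  }
  invisible(along)
}

check_along(covars, "id")  # passes silently: `id` is a column

# `subj` is found in the global environment but not in the data,
# so the check raises an error (captured here with try()).
res <- try(check_along(covars, "subj"), silent = TRUE)
inherits(res, "try-error")
```

With the check in place, the call from the example would presumably be written with `along = 'id'` and the other arguments unchanged. Whether that changes the reported power is not established in this thread.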