pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation

Power of 0% and different error messages #260

Closed romeodetimmerman closed 1 year ago

romeodetimmerman commented 1 year ago

Hello there,

I am trying to run a power analysis based on a mixed-effects logistic regression model containing an interaction between two fixed factors, plus two nested random factors.

When running a variety of powerSim commands, I consistently get a power of 0%, and lastResult()$errors shows two types of error message: "Test did not return a p-value" and "subscript out of bounds".

This is the model:

blues.m1 <- glmer(AAE_feature ~ Group*Form + (1|Artist/Song), control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=2e5)), data = blues, family=binomial)

When running the following powerSim command, I get a power of 0% and a "subscript out of bounds" error for each simulation:

powerSim(blues.m1, test = fixed("Group:Form"), nsim = 10)

Warning: This appears to be an "observed power" calculation

Power for predictor 'Group:Form', (95% confidence interval):
      0.00% ( 0.00, 30.85)

Test: z-test

Based on 10 simulations, (2 warnings, 10 errors)
alpha = 0.05, nrow = 3863

Time elapsed: 0 h 0 m 54 s

nb: result might be an observed power calculation
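(For reference, this is roughly how the two error messages above were inspected, immediately after the powerSim call:)

lastResult()$errors   # one row per failed simulation, with its error message
                      # (lastResult() simply returns the most recent powerSim result)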

Using fcompare also runs into problems, but this time the error messages read "Test did not return a p-value":

powerSim(blues.m1, test = fcompare(~ Group * Form), nsim = 10)

Warning: This appears to be an "observed power" calculation

Power for model comparison, (95% confidence interval):
      0.00% ( 0.00, 30.85)

Test: Likelihood ratio
      Comparison to ~Group * Form + [re]

Based on 10 simulations, (0 warnings, 10 errors)
alpha = 0.05, nrow = 3863

Time elapsed: 0 h 1 m 50 s

nb: result might be an observed power calculation

Similar issues arise when using powerCurve: I get a power of 0% for each break, and the sample size along ID does not increase; it stays at n = 3,863:

powerCurve(blues.m1, along = "ID", test = fixed("Group:Form", "z"), nsim = 10, breaks=c(4000, 16000, 32000, 40000))

Warning: This appears to be an "observed power" calculation

Power for predictor 'Group:Form', (95% confidence interval),
by largest value of ID:
      NA: 0.00% ( 0.00, 30.85) - 3863 rows
      NA: 0.00% ( 0.00, 30.85) - 3863 rows
      NA: 0.00% ( 0.00, 30.85) - 3863 rows
      NA: 0.00% ( 0.00, 30.85) - 3863 rows

Time elapsed: 0 h 3 m 48 s

Manually extending the sample size also returns 0% power and "subscript out of bounds" error messages:

blues.m1.ext <- extend(blues.m1, along="ID", n=40000)
powerSim(blues.m1.ext, fixed("Group:Form", "z"), nsim=10)

Warning: This appears to be an "observed power" calculation

Power for predictor 'Group:Form', (95% confidence interval):
      0.00% ( 0.00, 30.85)

Test: z-test

Based on 10 simulations, (0 warnings, 10 errors)
alpha = 0.05, nrow = 40000

Time elapsed: 0 h 9 m 20 s

nb: result might be an observed power calculation

I am really not sure what is causing these errors; I have looked at a number of previously posted issues, but wasn't able to find a fix. Any idea what the problem may be? (Should you wish to replicate the error messages, the data can be accessed here: https://doi.org/10.18710/RVLFXB)

Thanks in advance!

Romeo

pitakakariki commented 1 year ago

You probably need something like fixed("Group:Form", "z") or fcompare(~Group + Form).
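To spell that out with a minimal sketch (this is my reading of the two errors, not verified against your data): fcompare() takes the formula of the reduced fixed-effects model to compare against, so testing the interaction means comparing to ~Group + Form; fcompare(~Group * Form) compares the model against the same fixed-effects structure, which is presumably why no p-value comes back.

# likelihood-ratio test of Group:Form, comparing blues.m1 against the
# model with the interaction dropped (random effects are kept as-is)
powerSim(blues.m1, test = fcompare(~ Group + Form), nsim = 10)

# a single-coefficient z-test needs an exact coefficient name; for a
# factor-by-factor interaction these look something like "GroupX:FormY"
# ("GroupX:FormY" is a hypothetical name; check the real ones with:)
names(fixef(blues.m1))

The "subscript out of bounds" from fixed("Group:Form") most likely comes from that name mismatch, since the interaction expands into several dummy-coded coefficients and none of them is literally called "Group:Form".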

romeodetimmerman commented 1 year ago

I had already tried your first suggestion, but your second one did in fact work; thank you for pointing out the + versus * difference. I am wondering whether the power analysis itself is accurate, though. After running the following line for a little under two hours, I got no errors, but a power level of 100%:

powerSim(blues.m1, test = fcompare(~Group + Form), nsim = 1000)

Warning: This appears to be an "observed power" calculation

Power for model comparison, (95% confidence interval):
      100.0% (99.63, 100.0)

Test: Likelihood ratio
      Comparison to ~Group + Form + [re]

Based on 1000 simulations, (36 warnings, 0 errors)
alpha = 0.05, nrow = 3863

Time elapsed: 1 h 45 m 50 s

nb: result might be an observed power calculation

pitakakariki commented 1 year ago

The dataset is large enough that high power isn't surprising. I suspect if you test the interaction term in blues.m1 (e.g. with doTest(blues.m1, fcompare(~ Group + Form))) you'll get a very small p-value? That would suggest the effect size is large relative to the sample size.
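If the question is really about power for a smaller effect than the one estimated from your data, simr lets you overwrite a fixed effect before simulating. A rough sketch (the coefficient name below is hypothetical; use whatever names(fixef(blues.m1)) actually reports for the interaction terms, and note the value is on the log-odds scale):

library(simr)

# look up how the interaction coefficients are named in the fitted model
names(fixef(blues.m1))

# hypothetical coefficient name: set the interaction effect to the
# smallest effect size of interest before simulating
fixef(blues.m1)["GroupX:FormY"] <- 0.2

# power to detect that assumed effect, rather than the observed one
powerSim(blues.m1, test = fcompare(~ Group + Form), nsim = 100)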

romeodetimmerman commented 1 year ago

Yes, it does return a very small p-value (see below). Thanks for clearing that up.

p-value for model comparison: 1.077238e-12

Test: Likelihood ratio
      Comparison to ~Group + Form + [re]

One final question perhaps: any idea what is going wrong in the following powerCurve command? (I realize that this test is quite redundant now, but I'd still like to know how to run it for future reference.)

powerCurve(blues.m1, along = "ID", test = fcompare(~ Group + Form), nsim = 10, breaks=c(4000, 16000, 32000, 40000))

Warning: This appears to be an "observed power" calculation

Power for model comparison, (95% confidence interval),
by largest value of ID:
      NA: 100.0% (69.15, 100.0) - 3863 rows
      NA: 100.0% (69.15, 100.0) - 3863 rows
      NA: 100.0% (69.15, 100.0) - 3863 rows
      NA: 100.0% (69.15, 100.0) - 3863 rows

Time elapsed: 0 h 4 m 17 s

The following manual approach of extending the dataset along ID does seem to work, so I'm not sure what's going on here:

blues.m1.ext <- extend(blues.m1, along="ID", n=40000)
powerSim(blues.m1.ext, test = fcompare(~ Group + Form), nsim = 10)

Warning: This appears to be an "observed power" calculation

Power for model comparison, (95% confidence interval):
      100.0% (69.15, 100.0)

Test: Likelihood ratio
      Comparison to ~Group + Form + [re]

Based on 10 simulations, (0 warnings, 0 errors)
alpha = 0.05, nrow = 40000

Time elapsed: 0 h 10 m 35 s

nb: result might be an observed power calculation

pitakakariki commented 1 year ago

Did you mean to run powerCurve on blues.m1.ext?

romeodetimmerman commented 1 year ago

Apologies, I thought powerCurve could itself extend an existing model. I just ran

powerCurve(blues.m1.ext, along = "ID", test = fcompare(~ Group + Form), nsim = 10, breaks=c(4000, 16000, 32000, 40000))

and received the following result, so I think everything is now working as intended. Thanks for your help!

Warning: This appears to be an "observed power" calculation

Power for model comparison, (95% confidence interval),
by largest value of ID:
       4000: 100.0% (69.15, 100.0) -  4000 rows
      16000: 100.0% (69.15, 100.0) - 16000 rows
      32000: 100.0% (69.15, 100.0) - 32000 rows
      40000: 100.0% (69.15, 100.0) - 40000 rows

Time elapsed: 0 h 24 m 41 s