pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation

powerCurve issue with multiple fixed effects #197

Open ejbeier opened 3 years ago

ejbeier commented 3 years ago

Hello!

I want to run a power analysis based on the code and data from Brysbaert, M., & Stevens, M. (2018). Power Analysis and Effect Size in Mixed Effects Models: A Tutorial. (https://osf.io/fhrc6/)

When I re-run their original power analysis using a single fixed effect, I reproduce their power curve and it looks fine:

```r
fit1 <- lmer(invRT ~ REPETITION + (1|ITEM) + (1|SUBJECT), data = perea3)
pc1 <- powerCurve(fit1, along = "SUBJECT", nsim = 20, alpha = 0.02)
plot(pc1)
pc1
```

I want to change their model to include two fixed effects and their interaction. To run powerCurve, I add the `test` argument to specify which simpler model the full model should be compared to (following https://humburg.github.io/Power-Analysis/simr_power_analysis.html). However, this gives me a very different power curve, jumping from 0% to almost 100% power.

```r
fit2 <- lmer(invRT ~ REPETITION*CASE + (1|ITEM) + (1|SUBJECT), data = perea3)
pc2 <- powerCurve(fit2, along = "SUBJECT", nsim = 20, alpha = 0.02,
                  test = fcompare(y~CASE, method = 'lr'))
plot(pc2)
pc2
```

I can tell this is wrong because when I use extend() to simulate having more subjects, powerCurve again returns a similar figure, but now with 100% power right at the maximum number of subjects I used, and 0% power at 40 subjects (the size of the original dataset).

```r
fit3 <- extend(fit2, along = "SUBJECT", n = 100)
pc3 <- powerCurve(fit3, along = "SUBJECT", nsim = 20, alpha = 0.02,
                  test = fcompare(y~CASE, method = 'lr'))
plot(pc3)
pc3
```

I'd really appreciate help figuring out why powerCurve is doing this and what I'm doing wrong. Thank you very much!


ejbeier commented 3 years ago

A quick update in case this might be useful to others:

After some debugging, I found that powerCurve was throwing errors when running the model with the interaction. The errors read "Testing 20 variable lengths differ (found for 'REPETITION')", which points to some issue with the dataset (as in issue #174).

This might be a problem related to unbalanced data, since some NAs had to be removed from the dataset. It also seems related to issue #195, where powerCurve doesn't match powerSim. I ran powerSim for the model with the interaction with progressively larger numbers of subjects (using extend()), and it gave a more reasonable progression in power as subjects increased, with no errors. (The number of rows matches between powerSim and powerCurve for matching numbers of subjects.)
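For anyone hitting the same "variable lengths differ" error: one way to check whether NA removal is what unbalances the design is to subset to complete cases explicitly before fitting. This is only a sketch: `perea3` and the column names come from this thread, and everything else is an assumption about the dataset.

```r
# Sketch, assuming perea3 has the columns used in the models above.
vars <- c("invRT", "REPETITION", "CASE", "ITEM", "SUBJECT")
complete <- complete.cases(perea3[, vars])
cat("Rows dropped due to NAs:", sum(!complete), "\n")

# Cross-tabulate the design on the complete rows to see how unbalanced it is
perea3_cc <- perea3[complete, ]
table(perea3_cc$REPETITION, perea3_cc$CASE)

# Refitting on the explicitly complete data avoids lmer silently dropping
# rows, one possible source of mismatched variable lengths during simulation:
# fit2cc <- lmer(invRT ~ REPETITION*CASE + (1|ITEM) + (1|SUBJECT),
#                data = perea3_cc)
```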

Is it safe to assume powerSim is giving an accurate power estimate with unbalanced data?

pitakakariki commented 3 years ago

I can't replicate the results you're getting, but I suspect the problem is with `fcompare(y~CASE)`, which should probably be either `fcompare(invRT~CASE)` or just `fcompare(~CASE)`.
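If that is the cause, the powerCurve call from earlier in the thread would look like this. A sketch only, not tested against the perea3 data:

```r
# fcompare() takes the reduced fixed-effects formula; the response can be
# omitted, so ~CASE compares invRT ~ REPETITION*CASE against invRT ~ CASE.
pc2 <- powerCurve(fit2, along = "SUBJECT", nsim = 20, alpha = 0.02,
                  test = fcompare(~CASE, method = 'lr'))
plot(pc2)
pc2
```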