pitakakariki / simr

Power Analysis of Generalised Linear Mixed Models by Simulation
68 stars 19 forks source link

Power estimates change when variables are Z-scored #244

Closed lucindasisk closed 2 years ago

lucindasisk commented 2 years ago

Hi! First off, thanks so much for your work creating and maintaining this package--it is an amazing resource! I was aiming to fit a relatively simple model with simulated data but found that the power estimates changed when I z-scored the variables. Would you happen to have insight on what might be going on here? Thanks so much for any help! Here is a reproducible example:

## Without standardization
data(iris)

example_lm <- eval(parse(text="glm(Sepal.Length~Species + Sepal.Width + Petal.Width, 
              data=iris, 
              family='gaussian')"))
summary(example_lm)

coef(example_lm)["Speciesversicolor"] <- 0.3 #Change approximate effect size

powerSim(example_lm, test = fixed("Speciesversicolor", method = "t"), nsim=100, 
         progress=TRUE, seed=0)

#Power for predictor 'Speciesversicolor', (95% confidence interval):=======================|
#       19.00% (11.84, 28.07)
# 
# Test: t-test
# 
# Based on 100 simulations, (0 warnings, 0 errors)
# alpha = 0.05, nrow = 150
## With Standardization

iris_std <- iris %>% 
  mutate(Sepal.Length = scale(Sepal.Length, center=TRUE, scale = TRUE)) %>% 
  mutate(Sepal.Width = scale(Sepal.Width, center=TRUE, scale = TRUE)) %>% 
  mutate(Petal.Width = scale(Petal.Width, center=TRUE, scale = TRUE))

example_lm2 <- eval(parse(text="glm(Sepal.Length~Species + Sepal.Width + Petal.Width, 
              data=iris_std, 
              family='gaussian')"))
summary(example_lm2)

coef(example_lm2)["Speciesversicolor"] <- 0.3 #Change approximate effect size

powerSim(example_lm2, test = fixed("Speciesversicolor", method = "t"), nsim=100, 
         progress=TRUE, seed=0)

# Power for predictor 'Speciesversicolor', (95% confidence interval):=======================|
#      12.00% ( 6.36, 20.02)
#
# Test: t-test
#
# Based on 100 simulations, (0 warnings, 0 errors)
# alpha = 0.05, nrow = 150
pitakakariki commented 2 years ago

You've scaled the response variable Sepal.Length as well as the predictors. This means that your effect size of 0.3 has a different meaning in the two examples.

lucindasisk commented 2 years ago

I see, thanks so much for your help!

pitakakariki commented 2 years ago

While you're here - I'm curious about the code idiom:

example_lm <- eval(parse(text="glm(Sepal.Length~Species + Sepal.Width + Petal.Width, 
          data=iris, 
          family='gaussian')"))

Was there any reason it couldn't be simply:

example_lm <- lm(Sepal.Length ~ Species + Sepal.Width + Petal.Width, data=iris)
lucindasisk commented 2 years ago

Ah, that was an artefact from sifting through the posts on Github and seeing someone who encountered an issue that could be worked around using the eval/parse/text arguments. It runs and produces the same output both ways for me, though. Thanks again!