rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/

are emtrends adjusted for covariates in the model? #475

Closed karan-ganguly closed 2 months ago

karan-ganguly commented 3 months ago

Hello!

I have a linear mixed model where the independent variables include one factor with 5 levels and 3 continuous variables. I'm using emtrends to get the slope for one of the continuous IVs at every level of the factor. My question is: are these slopes adjusted for the other continuous variables in the model, or are they given for the average value of the other variables? If they are not adjusted, how can I get "partial regression" coefficients for the IV I'm interested in while still using emtrends?

Thanks!

model: lmerTest::lmer(DV ~ (1|subject) + Factor_IV + Continuous_IV1 + Continuous_IV2 + Continuous_IV3 + Factor_IV:Continuous_IV1)

rvlenth commented 3 months ago

Assuming you're talking about the continuous predictor that interacts with the factor in your model, emtrends() will completely disregard the other covariates, because they don't interact with either of those.

To get the partial regression coefficients, use fixef(model) (or summary(model) to also get their standard errors). The function of emtrends() is not to provide regression coefficients.

karan-ganguly commented 3 months ago

Thank you! But in that case, why do the results look different depending on whether there are covariates in the model? I tried to simulate my analysis below.

library(lmerTest)
library(emmeans)

n_subjects <- 500
n_conditions <- 4

set.seed(5763) 
df <- data.frame(
  subject = rep(1:n_subjects, each=n_conditions),
  factor = factor(rep(c("A", "B", "C", "D"), n_subjects)),
  DV = rnorm(n_subjects * n_conditions, mean= 55, sd= 3), 
  IV1 = rnorm(n_subjects * n_conditions, mean=45, sd=4),
  IV2 = rnorm(n_subjects * n_conditions, mean=27, sd=8) 
)

mod <- lmer(DV ~ (1 | subject) + factor + IV1 + IV2 + factor:IV1, df)
mod_noCov <- lmer(DV ~ (1 | subject) + factor + IV1 + factor:IV1, df)  # model with no covariate (IV2 omitted)

trends <- emtrends(mod, "factor", var = "IV1", adjust = "none")
trends_noCov <- emtrends(mod_noCov, "factor", var = "IV1", adjust = "none")  # emtrends from model with no covariate

trends

factor IV1.trend     SE   df lower.CL upper.CL
A         0.0590 0.0337 1991 -0.00707   0.1250
B        -0.0108 0.0337 1991 -0.07683   0.0553
C         0.0253 0.0340 1991 -0.04129   0.0920
D        -0.0269 0.0317 1991 -0.08910   0.0354

trends_noCov

factor IV1.trend     SE   df lower.CL upper.CL
A         0.0580 0.0337 1992 -0.00808   0.1240
B        -0.0108 0.0337 1992 -0.07690   0.0553
C         0.0269 0.0340 1992 -0.03968   0.0935
D        -0.0257 0.0317 1992 -0.08788   0.0366

rvlenth commented 3 months ago

When I said it will completely disregard the other covariates, I was saying their values don't affect the estimated trends. Try:

emtrends(mod, ~ factor * IV2, var = "IV1", at = list(IV2 = c(20, 30)), adjust = "none")

You will get the same trends for each value of IV2.
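A minimal sketch of how that claim could be checked programmatically, assuming the simulated data and model from the earlier comment. The object names `tr` and `by_iv2` are illustrative only, and the check relies on IV2 not interacting with IV1 or factor in this model.

```r
library(lmerTest)  # also attaches lme4
library(emmeans)

# Re-create the simulated data from the earlier comment
set.seed(5763)
n_subjects <- 500
n_conditions <- 4
n <- n_subjects * n_conditions
df <- data.frame(
  subject = rep(1:n_subjects, each = n_conditions),
  factor  = factor(rep(c("A", "B", "C", "D"), n_subjects)),
  DV  = rnorm(n, mean = 55, sd = 3),
  IV1 = rnorm(n, mean = 45, sd = 4),
  IV2 = rnorm(n, mean = 27, sd = 8)
)

mod <- lmer(DV ~ (1 | subject) + factor + IV1 + IV2 + factor:IV1, df)

# IV1 trends for each factor level, evaluated at two values of IV2
tr <- summary(emtrends(mod, ~ factor * IV2, var = "IV1",
                       at = list(IV2 = c(20, 30))))

# Group the estimated trends by the IV2 value they were evaluated at
by_iv2 <- split(tr$IV1.trend, tr$IV2)

# TRUE if the trends do not depend on IV2 (up to numerical tolerance)
isTRUE(all.equal(by_iv2[["20"]], by_iv2[["30"]], tolerance = 1e-6))
```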

But all results from emmeans functions depend on the model; and mod and mod_noCov are different models. IV1 and IV2 are probably slightly correlated, and that affects the partial regression coefficients. Compare the coefficients of the two models - they won't be the same.
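The comparison suggested here could look something like the following sketch, which assumes the same simulated data as in the earlier comment. `shared` and `coef_compare` are names made up for this illustration.

```r
library(lmerTest)  # also attaches lme4, which provides fixef()

# Re-create the simulated data from the earlier comment
set.seed(5763)
n_subjects <- 500
n_conditions <- 4
n <- n_subjects * n_conditions
df <- data.frame(
  subject = rep(1:n_subjects, each = n_conditions),
  factor  = factor(rep(c("A", "B", "C", "D"), n_subjects)),
  DV  = rnorm(n, mean = 55, sd = 3),
  IV1 = rnorm(n, mean = 45, sd = 4),
  IV2 = rnorm(n, mean = 27, sd = 8)
)

mod       <- lmer(DV ~ (1 | subject) + factor + IV1 + IV2 + factor:IV1, df)
mod_noCov <- lmer(DV ~ (1 | subject) + factor + IV1 + factor:IV1, df)

# Sample correlation between the two covariates
cor(df$IV1, df$IV2)

# Fixed effects that appear in both models, side by side
shared <- names(fixef(mod_noCov))
coef_compare <- cbind(
  with_IV2    = fixef(mod)[shared],
  without_IV2 = fixef(mod_noCov)[shared]
)
coef_compare
```

Any differences between the two columns come from fitting two different models, not from how emtrends() treats IV2.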

karan-ganguly commented 3 months ago

ok, I see, thanks a lot!

In that case, I think I have some basic confusion about emtrends. If what I'm interested in is how the effect of IV1 on DV changes at each level of "factor", while controlling for IV2, then that's not what emtrends is for?

In that case, what is the advantage or purpose of emtrends over summary(mod), other than that summary(mod) by default takes one factor level as the reference for the intercept?

Sorry, I realize this is a ridiculously basic Q; but I'm not very experienced with statistics, so would really appreciate it if you could clarify.

rvlenth commented 3 months ago

If you look at the model summary, the regression coefficients are something like the following (with default parameterization):

intercept1 (intercept for 1st trend)
intercept2 - intercept1
intercept3 - intercept1
intercept4 - intercept1
slope1 (Slope of 1st trend)
slope2 - slope1
slope3 - slope1
slope4 - slope1

whereas if you use emtrends, you get slope1, slope2, slope3, and slope4 directly.

That seems pretty trivial (but convenient). However, if you have more than one factor, and/or interactions with more covariates, things get a lot harder to decipher.
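As a rough illustration of the single-factor case, the sketch below rebuilds the four slopes by hand from the fixed effects and compares them with what emtrends() reports. It assumes the simulated data from earlier in the thread and R's default treatment contrasts, so the coefficient names (`IV1`, `factorB:IV1`, and so on) follow from that setup. `manual_slopes` and `em_slopes` are illustrative names.

```r
library(lmerTest)  # also attaches lme4, which provides fixef()
library(emmeans)

# Re-create the simulated data from the earlier comment
set.seed(5763)
n_subjects <- 500
n_conditions <- 4
n <- n_subjects * n_conditions
df <- data.frame(
  subject = rep(1:n_subjects, each = n_conditions),
  factor  = factor(rep(c("A", "B", "C", "D"), n_subjects)),
  DV  = rnorm(n, mean = 55, sd = 3),
  IV1 = rnorm(n, mean = 45, sd = 4),
  IV2 = rnorm(n, mean = 27, sd = 8)
)

mod <- lmer(DV ~ (1 | subject) + factor + IV1 + IV2 + factor:IV1, df)

# With treatment contrasts, level A is the reference:
# slope1 is the IV1 coefficient; the others add the interaction terms
b <- fixef(mod)
manual_slopes <- c(
  A = unname(b["IV1"]),
  B = unname(b["IV1"] + b["factorB:IV1"]),
  C = unname(b["IV1"] + b["factorC:IV1"]),
  D = unname(b["IV1"] + b["factorD:IV1"])
)

# Slopes reported directly by emtrends()
em_slopes <- summary(emtrends(mod, "factor", var = "IV1"))$IV1.trend

cbind(manual = manual_slopes, emtrends = em_slopes)
```

The two columns should agree up to numerical tolerance; emtrends() additionally supplies a standard error and confidence interval for each slope without any hand calculation.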

rvlenth commented 3 months ago

PS I am about to leave on a vacation, so I won't be able to answer more questions for a while. They'll have to wait until I return.

karan-ganguly commented 3 months ago

Thanks very much. That makes perfect sense. It definitely makes it a lot easier. But then I was not sure because you said that emtrends are not partial regression coefficients. Now I understand that they represent the same slope, but the numbers are not the same because of how regression coefficients are calculated.

Thank you very much! Enjoy your vacation :)

rvlenth commented 2 months ago

I think this issue is resolved, so closing it.