Closed rpmady closed 1 year ago
Please take a look at a similar example:
require(emmeans)
## Loading required package: emmeans
# This example uses the 'mtcars' dataset provided with R
mod <- lm(mpg ~ factor(cyl) * disp, data = mtcars)
emmeans(mod, ~ disp * cyl, at = list(disp = c(50, 100, 200, 400)))
## disp cyl emmean SE df lower.CL upper.CL
## 50 4 34.1 1.697 26 30.63 37.60
## 100 4 27.4 0.729 26 25.86 28.86
## 200 4 13.8 2.742 26 8.21 19.48
## 400 4 -13.2 8.260 26 -30.16 3.79
## 50 6 19.3 3.232 26 12.62 25.91
## 100 6 19.4 2.138 26 15.05 23.84
## 200 6 19.8 0.977 26 17.79 21.81
## 400 6 20.5 5.127 26 9.99 31.06
## 50 8 21.1 3.009 26 14.87 27.24
## 100 8 20.1 2.537 26 14.85 25.28
## 200 8 18.1 1.615 26 14.79 21.43
## 400 8 14.2 0.780 26 12.58 15.78
##
## Confidence level used: 0.95
emmeans(mod, "cyl", at = list(disp = c(50, 100, 200, 400)))
## NOTE: Results may be misleading due to involvement in interactions
## cyl emmean SE df lower.CL upper.CL
## 4 15.5 2.407 26 10.6 20.5
## 6 19.8 0.902 26 17.9 21.6
## 8 18.4 1.728 26 14.8 21.9
##
## Results are averaged over the levels of: disp
## Confidence level used: 0.95
Created on 2023-02-08 with reprex v2.0.2
The first emmeans()
call is like yours, in that it asks for combinations of the factor levels and the specified covariate values. Note that there is indeed no annotation that means were averaged over that covariate, precisely because they weren't. All the combinations of those two variables are shown, not averaged-over.
The second emmeans()
call just specifies the one factor, and the output does average over the four disp
values provided, and it says so in the annotations below the output.
Maybe you meant to set levels for covariate2
instead of/as well? Anyway, I don't see a bug here.
Oh, another discrepancy is your cov.reduce = range
. That affects both covariates. So your emmeans()
call with at
but not cov.reduce
will set covariate2
at its mean (one value), rather than its min and max (2 values).
Based on this, the comparison case would be
emmeans(model, ~ factor1* covariate1, at=list(covariate1=c(1,2,3,4,5)), cov.reduce = range, type = "response")
Thanks for demonstrating that with the mtcars dataset. I'll clarify what I meant by also using the mtcars
dataset.
I modified the model in the example you provided:
mod <- lm(mpg ~ factor(cyl) * disp + hp + drat, data = mtcars)
When I calculate the emmeans at specific values of disp
, I expected to see in the output that it was averaged over the other predictor variables (hp
and drat
) but I don't:
emmeans(mod, ~ disp * cyl, at = list(disp = c(50, 100, 200, 400)))
## cyl emmean SE df lower.CL upper.CL
## 4 15.0 2.446 24 9.93 20.0
## 6 19.4 0.954 24 17.39 21.3
## 8 19.5 1.927 24 15.54 23.5
##Results are averaged over the levels of: disp
##Confidence level used: 0.95
Are the other variables hp
and drat
being set at their mean and it is just not shown?
hp and drat are not in the model, and play no role whatsoever in the analysis.
Apologies, I'm not following. I did put hp and drat in the example I just posted.
mod <- lm(mpg ~ factor(cyl) * disp + hp + drat, data = mtcars)
When calculating emmeans, do you need to also include them like this:
emmeans(mod, ~ disp * cyl + hp + drat, at = list(disp = c(50, 100, 200, 400)))
Sorry, I was trying to answer too quickly. What emmeans()
does is based on the reference grid:
> mod <- lm(mpg ~ factor(cyl) * disp + hp + drat, data = mtcars)
> ref_grid(mod, at = list(disp = c(50, 100, 200, 400)))
'emmGrid' object with variables:
cyl = 4, 6, 8
disp = 50, 100, 200, 400
hp = 146.69
drat = 3.5966
When you have covariates, they are by default reduced to their average. Hence only one value of those covariates, hence they are not "averaged over." If you reduce them to two or more values, e.g.
> ref_grid(mod, at = list(disp = c(50, 100, 200, 400)), cov.reduce = range)
'emmGrid' object with variables:
cyl = 4, 6, 8
disp = 50, 100, 200, 400
hp = 52, 335
drat = 2.76, 4.93
then they do show up in the annotations of what is averaged over
> emmeans(mod, "cyl", at = list(disp = c(50, 100, 200, 400)), cov.reduce = range)
NOTE: Results may be misleading due to involvement in interactions
cyl emmean SE df lower.CL upper.CL
4 14.4 2.60 24 9.05 19.8
6 18.8 1.25 24 16.21 21.4
8 19.0 1.89 24 15.06 22.9
Results are averaged over the levels of: disp, hp, drat
Confidence level used: 0.95
Okay, I follow 😊 Thank you for clarifying and for your patience (especially since this was not a bug afterall, and my own misunderstanding).
Describe the bug
When I calculate emmeans using at=list(), results are averaged across one other predictor instead of all the predictors.
To reproduce
I have run a glm model using the glmmTMB package with five predictors. Two are binary categorical variables (
factor1
,factor2
), one is a proportion (prop
) , and the other two are continuous (covariate1
,covariate2
).In the model I have the interaction term categorical1*continuous1, and I want to calculate the emmeans, so I used at=list().
emmeans(model, ~ factor1* covariate1, at=list(covariate1=c(1,2,3,4,5)), type = "response")
The output shows me emmeanas with "Results are averaged over the levels of: factor2"Expected behavior
I expected the emmean results to be averaged over the rest of the predictor variables. I could only get the emmeans to be averaged over the other predictor variables by using the following code:
emmeans(model, ~ factor1* covariate1, cov.reduce = range, type = "response")
The output showed me emmeans with "Results are averaged over factor2, prop, covariate2From the documentation and vignettes, I expected the list to just allow me to get emmeans at multiple values for the continuous variable, and not exclude averaging across the other variables.