rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/
341 stars 30 forks source link

EMMs by different levels of random effects (grouping factor) #118

Closed strengejacke closed 5 years ago

strengejacke commented 5 years ago

I would like to get EMMs at different values from random effects (grouping factors), similar to what you can get from predict(). See very last example here.

Currently, emmeans() does not allow to specify a random effects group factor (unless I missed something). Do you think it's possible to implement such a feature?

library(lme4)
#> Loading required package: Matrix
library(emmeans)

data(sleepstudy)
m <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)

emmeans(m, "Days")
#>  Days emmean   SE df lower.CL upper.CL
#>   4.5    299 9.05 17      279      318
#> 
#> Degrees-of-freedom method: kenward-roger 
#> Confidence level used: 0.95

emmeans(m, c("Days", "Subject"))
#> Error in emmeans(m, c("Days", "Subject")): No variable named Subject in the reference grid
emmeans(m, "Days", by = "Subject")
#> Error in emmeans(m, "Days", by = "Subject"): No variable named Subject in the reference grid
emmeans(m, "Days", list("Subject" = c(308, 333, 350)))
#> Error in RG@levels[[f]]: no such index at level 1

Created on 2019-06-21 by the reprex package (v0.3.0)

rvlenth commented 5 years ago

Currently, the design of emmeans does not allow random effects to be included in the reference grid.

Conceptually, when one wants predictions at different levels of a random factor one is (at least temporarily) regarding that factor as having fixed levels of interest. So my suggestion is to fit another model that treats that factor as fixed. In your example, it would be

> m2 <- lm(Reaction ~ Days + Subject + Days:Subject, data = lme4::sleepstudy)

> emmeans(m2, ~ Days*Subject)
 Days Subject emmean   SE  df lower.CL upper.CL
  4.5 308        342 8.09 144      326      358
  4.5 309        215 8.09 144      199      231
  4.5 310        231 8.09 144      215      247
  4.5 330        303 8.09 144      287      319
  4.5 331        309 8.09 144      293      325
  4.5 332        307 8.09 144      291      323
  4.5 333        316 8.09 144      300      332
  4.5 334        295 8.09 144      279      311
  4.5 335        250 8.09 144      234      266
  4.5 337        376 8.09 144      360      392
  4.5 349        276 8.09 144      260      292
  4.5 350        314 8.09 144      298      330
  4.5 351        290 8.09 144      274      306
  4.5 352        337 8.09 144      321      353
  4.5 369        306 8.09 144      290      322
  4.5 370        292 8.09 144      276      308
  4.5 371        295 8.09 144      279      311
  4.5 372        318 8.09 144      302      334

Confidence level used: 0.95 

> emmip(m2, Subject ~ Days, cov.reduce = FALSE)

image

This does not mean that m2 has to be your model for other follow-up analyses, but for purposes of predicting separately for each subject, it is the one that makes sense.

strengejacke commented 5 years ago

Thanks a lot! I'll give it a try. You can close this issue if you like, unless you want to come back to this issue somehow in the future.

rvlenth commented 5 years ago

OK, will close. BTW, since you are developer of ggpredict, you may be interested in my recent addition to emmeans of rudimentary provisions for prediction intervals. Also, bias corrrections of back-transformed means.

Unfortunately, I screwed up (but will blame RStudio's update) with inter-links in the vignettes. I'm going to see if I can resolve this and plead with CRAN to let me update soon.

strengejacke commented 5 years ago

BTW, since you are developer of ggpredict, you may be interested in my recent addition to emmeans of rudimentary provisions for prediction intervals. Also, bias corrrections of back-transformed means.

Yes, we were talking via email about this. I have read the vignette, and coming back to that vignette just now, I saw that there was exact the same approach how to deal with this issue as you suggested above.

There was also a discussion on the R mixed-models list on prediction intervals, I've linked to that vignette there as well (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2019q2/027970.html).

rvlenth commented 5 years ago

Duh, I should have remembered! My excuse is summer travel; been sort of in and out of communication on R things.

Russ

Sent from my iPhone

On Jun 21, 2019, at 11:38 AM, Daniel notifications@github.com<mailto:notifications@github.com> wrote:

BTW, since you are developer of ggpredict, you may be interested in my recent addition to emmeans of rudimentary provisions for prediction intervals. Also, bias corrrections of back-transformed means.

Yes, we were talking via email about this. I have read the vignette, and coming back to that vignette just now, I saw that there was exact the same approach how to deal with this issue as you suggested above.

There was also a discussion on the R mixed-models list on prediction intervals, I've linked to that vignette there as well (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2019q2/027970.html).

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/rvlenth/emmeans/issues/118?email_source=notifications&email_token=AGMJPL3G2HKQCLTHXK2MIHTP3T7W5A5CNFSM4H2NJZGKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYI6V5I#issuecomment-504490741, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AGMJPL4WVCI7CSML6PFNHTTP3T7W5ANCNFSM4H2NJZGA.

OmidGhasemi21 commented 11 months ago

Hi both,

I was wondering if you have found a solution (possibly a package) for this issue?

Besides what Russ suggested, is not it possible to manually add the fixed estimate to the ranef values to get the coef for random effect groups?

Sorry if it is not relevant to emmeans, but I was wondering if adding the value for the predictor to the ranef() and their CIs makes sense.

library(lme4)
library(tidyverse)

data <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv") %>%
  drop_na()

model <- lmer(flipper_length_mm ~ bill_depth_mm + (bill_depth_mm | species),
              data = data)

broom.mixed::tidy(model, effects = "ran_vals", conf.int = TRUE)%>%
  filter(term == "bill_depth_mm") %>%
  # add the fixed coefficient  to ranef values to get coefs
  mutate(estimate = estimate + 3.24,
         conf.low = conf.low + 3.24,
         conf.high = conf.high + 3.24)  %>%
  select(species=level, estimate, std.error, conf.low, conf.high) %>%
  mutate(ci_width= conf.high - conf.low)
# species   estimate std.error conf.low conf.high ci_width
# 1 Adelie        1.84     0.326     1.20      2.48     1.28
# 2 Chinstrap     3.14     0.433     2.29      3.99     1.70
# 3 Gentoo        4.74     0.434     3.89      5.60     1.70
rvlenth commented 11 months ago

It doesn't look like you're accounting for the estimation errors in the fixed effects. So your confidence intervals are too narrow.

OmidGhasemi21 commented 11 months ago

Thanks @rvlenth for clarification. So manual calculation of CI (at least the way that I did) is not straightforward. Does it mean that there are no ways in getting those group-level values? In that case, regarding the random intercept as fixed (e.g., bill_depth_mm * species in the example above) is the best available option. Thanks.

rvlenth commented 11 months ago

It's probably possible, but what you have is a linear combination of the fixed effects plus a linear combination of the random effects, so the calculation involves the covariance matrix of each. Plus, it's not clear that those estimates are correlated, but if so, you need the covariance matrix for all the effects together. It may be possible to get these using the predict() function for merMod models, but I am not sure. This is not something I am interested in in incorporating in the emmeans package, but maybe somebody else is interested and can help.