mjskay / tidybayes

Bayesian analysis + tidy data + geoms (R package)
http://mjskay.github.io/tidybayes
GNU General Public License v3.0
709 stars 59 forks source link

Multiplicative contrasts with gather_emmeans_draws #311

Closed zacrobinson5 closed 11 months ago

zacrobinson5 commented 11 months ago

have done my best to look for answers here but have come up short.

I have a cumulative link model fit with brms that I am attempting to get the entire marginal posterior distribution using the combination of emmeans / tidybayes. However, each time I get the contrast on the response scale, it goes from multiplicative to additive after using "gather_emmeans_draws" - please let me know what I'm doing wrong / misunderstanding! Thank you very much!

model = brm(data = df, y ~ x , family = cumulative("logit"))

grid= emmeans(model, ~x, type="response")%>% contrast(method="revpairwise") (multiplicative contrast)

draws = gather_emmeans_draws(grid)%>% mode_hdi() (now additive contrast)

mjskay commented 11 months ago

Hmm, can you give me an example dataset and the expected output? That would make it easier to debug. Thanks!

zacrobinson5 commented 11 months ago

@mjskay thanks for the reply! here is some code that should highlight the disparity. If you take a look at the contrasts (odds ratios) in the grid object they are much different than the contrasts (differences) in the draws object. Let me know if you need anything else!

library(tidyverse) library(brms) library(emmeans) library(tidybayes)

data("mtcars") data=mtcars%>% mutate_at("gear",factor)

model=brm(data = data, family = bernoulli(), vs~gear, seed = 123)

grid=emmeans(model,revpairwise~gear, type="response")$contrasts

draws=gather_emmeans_draws(grid)%>% median_hdi()

mjskay commented 11 months ago

Ah, I see the issue. It appears that emmeans does not apply the inverse link function until the output is displayed, so internally it is storing the log odds, and then it exponentiates them before display to show the odds ratios.

Unfortunately it looks like the function it uses to determine what transformation to apply, emmeans:::.make.link(), is not exported from emmeans. That means tidybayes can't really be updated to handle these cases robustly, because it is not generally a good idea to write packages that rely on other packages unexported functions (and even if I thought that was a good idea, CRAN would complain if I tried to do it).

However, in this particular case it is easy enough to do manually. One way would be to use the known inverse link for this model, which is exp(). Something like:

grid |> 
  gather_emmeans_draws() |> 
  mutate(.value = exp(.value)) |> 
  median_hdci()
## # A tibble: 3 × 7
##   contrast       .value     .lower  .upper .width .point .interval
##   <chr>           <dbl>      <dbl>   <dbl>  <dbl> <chr>  <chr>    
## 1 gear4 - gear3 26.4    1.32       188.      0.95 median hdci     
## 2 gear5 - gear3  0.801  0.000186     7.27    0.95 median hdci     
## 3 gear5 - gear4  0.0299 0.00000604   0.264   0.95 median hdci 

(I'm using hdci instead of hdi because I'd expect these to be continuous intervals, and hdci will be more robust)

If you want code that should be robust to different link functions, you can do something like this, bearing in mind it relies on an internal (unexported) function from emmeans:

# this ugly figures out the inverse link function from the grid object
linkinv = emmeans:::.make.link(grid@misc$tran)$linkinv
grid |> 
  gather_emmeans_draws() |> 
  mutate(.value = linkinv(.value)) |> 
  median_hdci()
## # A tibble: 3 × 7
##   contrast       .value     .lower  .upper .width .point .interval
##   <chr>           <dbl>      <dbl>   <dbl>  <dbl> <chr>  <chr>    
## 1 gear4 - gear3 26.4    1.32       188.      0.95 median hdci     
## 2 gear5 - gear3  0.801  0.000186     7.27    0.95 median hdci     
## 3 gear5 - gear4  0.0299 0.00000604   0.264   0.95 median hdci   

Does that help?

zacrobinson5 commented 11 months ago

@mjskay Perfect, thank you so much!

mjskay commented 11 months ago

glad to help!