mjskay / tidybayes

Bayesian analysis + tidy data + geoms (R package)
http://mjskay.github.io/tidybayes
GNU General Public License v3.0

Use fixed point in `add_epred_draws` #290

Closed (jokroese closed this issue 2 years ago)

jokroese commented 2 years ago

First of all, thank you @mjskay and the tidybayes team and contributors for creating and maintaining this package. I use it almost every day and it makes my work so much easier.

Second, I am trying to do something a little unusual. I have a model where I want to collapse one of the random effect distributions to a fixed point estimate before calculating the expected predictions (through add_epred_draws). To make this clearer, here is an example:

library(dplyr)
options(mc.cores = parallel::detectCores())

# create the data
data <- tibble(x_1 = rnorm(1000)) |>
  rowwise() |>
  mutate(category = sample(c("a", "b", "c"), size = 1),
         x_2 = case_when(category == "a" ~ 0, category == "b" ~ 1, category == "c" ~ 2),
         y = rnorm(1, 0.5 + x_1 + x_2, 1))

# fit a brms model
m1 <- brms::brm(y ~ 1 + x_1 + (1 | category),
          data = data)

I now want to use something like tidybayes::add_epred_draws(object = m1, etc.) as if the category random effect had zero variance, i.e. category[i] ~ N(0,0).

I can achieve the effect by creating a separate model that does not have category in it (or equivalently, update the model with a fixed point prior in brms) and then use add_epred_draws. However, in my use case, I have a bigger model where I want to collapse the distribution of each of the random effect variables sequentially. (The idea is to show what the expected value and variation would be if we had 'the average' from each category, ignoring the variation from that category.) To avoid running multiple models, I would prefer to do this within tidybayes after fitting the single model.

Is there a way to do this using tidybayes?

mjskay commented 2 years ago

Yes, you can zero out random effects. tidybayes::add_epred_draws() calls down to brms::posterior_epred(), which has an re_formula parameter for doing stuff like this.

You can set it to NA to zero out all random effects, or you can provide a formula containing just the random effects you want to keep. The default is re_formula = NULL, which includes all of them (for your model that is equivalent to re_formula = ~ (1 | category)). If you replace it with re_formula = NA (or ~ 0), the category random effect is not applied to the predictions; something like: tidybayes::add_epred_draws(newdata = data, object = m1, re_formula = NA). Note that add_epred_draws() also needs the data to predict on as its first argument (newdata). In this way you can use re_formula to ignore specific random effects in your predictions without refitting the model.
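
For the bigger-model case, where only some random effects should be dropped, here is a rough sketch of how that could look. It is an illustration under assumptions, not taken from the original model: the second grouping variable `site`, the data frame `dat`, the model `m2`, and the prediction grid `new_data` are all hypothetical names, and the sampler settings are kept small only so the example runs quickly.

```r
library(dplyr)
library(tidybayes)

set.seed(1)

# Simulated data with two grouping factors. `site` is a hypothetical second
# grouping variable added for illustration; it has no true effect here.
dat <- tibble(
  x_1      = rnorm(300),
  category = sample(c("a", "b", "c"), 300, replace = TRUE),
  site     = sample(paste0("s", 1:5), 300, replace = TRUE)
) |>
  mutate(
    x_2 = match(category, c("a", "b", "c")) - 1,  # a = 0, b = 1, c = 2
    y   = rnorm(n(), mean = 0.5 + x_1 + x_2, sd = 1)
  )

# One model with both group-level intercepts (small run for speed;
# with so few grouping levels the sampler may emit warnings).
m2 <- brms::brm(
  y ~ 1 + x_1 + (1 | category) + (1 | site),
  data = dat, chains = 2, iter = 1000, refresh = 0
)

# Hypothetical prediction grid: each category at x_1 = 0 for one site.
new_data <- tibble(x_1 = 0, category = c("a", "b", "c"), site = "s1")

# Default (re_formula = NULL): all group-level effects are included.
epred_all <- add_epred_draws(new_data, m2, ndraws = 200)

# Keep only the site effect; the category intercept is treated as zero.
epred_no_category <- add_epred_draws(
  new_data, m2, re_formula = ~ (1 | site), ndraws = 200
)

# Drop every group-level effect.
epred_none <- add_epred_draws(new_data, m2, re_formula = NA, ndraws = 200)

# Compare the expected predictions under the three settings.
bind_rows(
  all = epred_all, no_category = epred_no_category, none = epred_none,
  .id = "kept"
) |>
  ungroup() |>
  group_by(kept, category) |>
  median_qi(.epred)
```

With the category effect dropped, the three categories should get the same expected prediction within each draw, so any remaining spread comes from the fixed effects (and, in `epred_no_category`, from the site intercept). Repeating the call with a different re_formula each time would collapse each random effect in turn from the single fitted model.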

Hope that helps!

jokroese commented 2 years ago

Spot on, exactly what I was looking for! Thanks so much!

mjskay commented 2 years ago

Great, glad to help!