`avg_slopes` difference between `glm` and `brm` (binomial family)

dipetkov commented 1 year ago

This is a follow-up to issue https://github.com/vincentarelbundock/marginaleffects/issues/813 about using the aggregate binomial model to get marginal estimates that match those obtained by fitting a model on the original (0s and 1s) data.

When the model is fitted with glm, a solution is to specify the weights: wts = n / sum(n). This solution doesn't seem to work for a binomial regression fitted with brm. (In that case the marginal slope can be reproduced with newdata = original_data.)

Here is a reprex.

library("palmerpenguins")
library("brms")
library("marginaleffects")
library("tidyverse")

data_01s <- penguins %>%
  mutate(
    x = body_mass_g,
    y = 1L * (sex == "male"),
    n = 1L
  ) %>%
  drop_na()
data_agg <- data_01s %>%
  summarise(
    y = sum(y),
    n = n(),
    .by = x
  )

# Two equivalent specifications of the same model
m_01s <- brm(
  y ~ x,
  family = bernoulli(),
  data = data_01s
)
m_agg <- brm(
  y | trials(n) ~ x,
  family = binomial(),
  data = data_agg
)

# Will get the same estimate 3 different ways if the model is fitted with `glm`
# m_01s <- glm(
#   y ~ x,
#   family = binomial(),
#   data = data_01s
# )
# m_agg <- glm(
#   cbind(y, n - y) ~ x,
#   family = binomial(),
#   data = data_agg
# )

# Same estimate on the logit scale without weights
avg_slopes(m_01s, type = "link")
#>  Term Estimate    2.5 %  97.5 %
#>     x  0.00125 0.000917 0.00161
#> 
avg_slopes(m_agg, type = "link")
#>  Term Estimate    2.5 % 97.5 %
#>     x  0.00125 0.000929 0.0016

avg_slopes(m_01s)
#>  Term Estimate    2.5 %   97.5 %
#>     x 0.000254 0.000205 0.000295
#> 
avg_slopes(m_agg, wts = data_agg$n / sum(data_agg$n)) # <- Different estimate
#>  Term Estimate   2.5 % 97.5 %
#>     x  0.00143 0.00112 0.0017
#> 
avg_slopes(m_agg, newdata = data_01s)
#>  Term Estimate    2.5 %   97.5 %
#>     x 0.000254 0.000204 0.000296

vincentarelbundock commented 1 year ago

Thanks a lot for the detailed bug report. I confirm that I see the same issue on my computer.

I'll look at this as soon as I find some time.

Notes to self:

dydx does not appear to be flipped to dydxavgwts when wts is given.
As a result, the slope estimates are only averaged at the very end here: https://github.com/vincentarelbundock/marginaleffects/blob/main/R/get_contrasts.R#L418

vincentarelbundock commented 1 year ago

OK, I don't think this is a marginaleffects problem after all. I'm not super familiar with brms, but it looks like the brms::posterior_epred() function we use to make predictions returns different results depending on the model object and newdata structure.

Below, I illustrate this with simple predictions. I show how to get results using raw brms commands and also with avg_predictions(). As you will see, the results with the binomial or standard formulations are not equivalent, but the marginaleffects() functions produce OK results. After those examples, I show you three equivalences with slopes.

I also close this issue because I am not convinced that this is a bug; it has to do with brms. But feel free to keep the discussion going if you have more insights to bring to bear on the matter.

w <- data_agg$n / sum(data_agg$n)

# correct
avg_predictions(m_01s)

> Estimate 2.5 % 97.5 %
>    0.505 0.454  0.555
> 
> Columns: estimate, conf.low, conf.high 
> Type:  response 

posterior_epred(m_01s, newdata = data_01s) |>
  apply(1, mean) |>
  median()
> [1] 0.50465

# also correct, but the `brms` models do something different under the hood
avg_predictions(m_agg, newdata = data_agg, wts = w)
>  Estimate 2.5 % 97.5 %
>      2.53  2.26    2.8
> Columns: estimate, conf.low, conf.high 
> Type:  response 

posterior_epred(m_agg, newdata = data_agg) |>
  apply(1, weighted.mean, w = w) |>
  median()
> [1] 2.5333

# correct (no weights)
avg_predictions(m_agg, newdata = data_agg)
>  Estimate 2.5 % 97.5 %
>      1.81  1.64   1.98
> 
> Columns: estimate, conf.low, conf.high 
> Type:  response 

posterior_epred(m_agg, newdata = data_agg) |>
  apply(1, mean) |>
  median()
> [1] 1.8121

Now three examples of different syntaxes that produce the same results with avg_comparisons():

avg_slopes(m_01s)

avg_slopes(m_agg, newdata = data_01s)

avg_slopes(m_01s, newdata = data_agg, wts = w)

dipetkov commented 1 year ago

I'm not re-opening this issue as there is no error in either brms or marginaleffects; just providing an explanation about what I think is going on.

With posterior_epred(m_agg), the posterior sample is a vector $[n_i\tilde{p}_i]$ where $n_i$ is the sample size and $\tilde{p}_i$ is the posterior probability. So the sample sizes are weights that have already been applied.

Here is how to reproduce the avg_prediction with the aggregate data.

n <- data_agg$n

avg_predictions(m_agg, newdata = data_01s)
#>  Estimate 2.5 % 97.5 %
#>     0.504 0.459  0.549
posterior_epred(m_agg, newdata = data_agg) |>
  # The weights {n_i} have already been applied, so we only need to normalize by sum(n_i).
  apply(1, \(x) sum(x) / sum(n)) |>
  median()
#> [1] 0.5035349

I think this also means that there are no weights w such that avg_slopes(m_agg, wts = w) reproduce the marginal effect correctly. Instead, when we are working with the aggregate model, we can use one of these two methods:

# Model `m_agg` uses weights. In the original, un-collapsed dataset
# each observation is a subset of sample size of 1.
avg_slopes(m_agg, newdata = data_01s %>% mutate(n = 1))
#>  Term Estimate    2.5 %   97.5 %
#>     x 0.000255 0.000204 0.000297

# The sample sizes `n` get in the way of the `posterior_epred` calculation.
# So "turn those off" by setting `n = 1` and use the weights argument instead.
avg_slopes(m_agg, newdata = data_agg %>% mutate(n = 1), wts = data_agg$n)
#>  Term Estimate    2.5 %   97.5 %
#>     x 0.000255 0.000204 0.000297

Sorry about the false bug report & thank you for helping me to figure this out.

vincentarelbundock commented 1 year ago

This is super useful. Thanks for coming around and clarifying. This was not a straightforward issue...

vincentarelbundock / marginaleffects

`avg_slopes` difference between `glm` and `brm` (binomial family) #913