Closed OmidGhasemi21 closed 2 years ago
It depends on what you mean by "ignore". One interpretation of "ignore" is "the effect, conditional on the average of the ignored predictor" and another is "the effect, averaging out the ignored predictor". In linear models these tend to be the same, so the way you get there doesn't matter too much. In non-linear models these can be different, as taking the average on a linear scale and then putting it through a non-linear transformation is not the same as applying a non-linear transformation and then taking an average.
There are a few approaches off the top of my head. Note that (again because linear model), they all give the same results on this model.
(I'll be using tidybayes::add_epred_draws()
, which is what tidybayes::add_fitted_draws()
was renamed to in the latest version to be more explicit that the result is the expectation of the posterior predictive distribution).
Since you are using sum-to-zero coding with a linear model, one way to get the effect conditional on the average of the ignored predictor is to set the coefficients of the ignored predictor to 0. In brms you can do this by setting the factor to NA
in the data you give to add_epred_draws()
. If you look at the documentation of the newdata
parameter in brms::posterior_epred()
(which is the function tidybayes::add_epred_draws()
calls down it), it says:
NA
values within factors are interpreted as if all dummy variables of this factor are zero. This allows, for instance, to make predictions of the grand mean when using sum coding.
So, we can change the call to tidyr::expand()
to set condition
to NA
, which will mean that add_epred_draws()
(via posterior_epred()
) will zero out the condition
-related coefficients. Thus the result won't have different levels of condition
, and we can pass it to compare_levels()
:
stroop %>%
tidyr::expand(congruency, condition = NA) %>%
add_epred_draws(m1) %>%
compare_levels(.epred, by = congruency) %>%
median_qi()
# A tibble: 1 x 8
congruency condition .epred .lower .upper .width .point .interval
<chr> <lgl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 incongruent - congruent NA 0.106 0.104 0.107 0.95 median qi
Another alternative is to marginalize out (average over) the predictor you don't care about, which involves passing in the full prediction grid to add_epred_draws()
, but then averaging out the levels of condition
. To do that you must group by .draw
and every predictor you aren't marginalizing out and then take the mean:
stroop %>%
tidyr::expand(nesting(congruency, condition)) %>%
add_epred_draws(m1) %>%
group_by(.draw, congruency) %>%
summarise(.epred = mean(.epred), .groups = "drop") %>%
compare_levels(.epred, by = congruency) %>%
median_qi()
# A tibble: 1 x 7
congruency .epred .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 incongruent - congruent 0.106 0.104 0.107 0.95 median qi
Note the result is exactly the same as before.
emmeans
I know you said you didn't want to use it, but for comparison (and sanity checking), you can get the same result out of emmeans
:
library(emmeans)
m1 %>%
emmeans( ~ congruency) %>%
contrast(method = "pairwise") %>%
gather_emmeans_draws() %>%
median_qi()
NOTE: Results may be misleading due to involvement in interactions
# A tibble: 1 x 7
contrast .value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 congruent - incongruent -0.106 -0.107 -0.104 0.95 median qi
The result is the same, it just did the comparison in the opposite order (so the sign is flipped).
Wow, this was excellent. Thank you so much @mjskay for your super helpful and clear explanation. And also thanks for your amazing package. it is an absolute gem.
My pleasure!
Hi @mjskay
I was wondering if it is possible to use
compare_levels
withadd_fitted_draws
when we have a model with two predictors? I mean, can we compare levels of A and ignore B iny ~ A * B
? here is a simple reproducible example:The final code gives the "Each row of output must be identified by a unique combination of keys" error.
I am aware that we can do this using
emmeans
,contrasts
, andgather_emmeans_draws
but I prefer the former method as it is easier to transform values before comparing levels.Thank you.