Closed zhengchencai closed 3 weeks ago
Do the brms functions like posterior_epred() work?
It might be useful to see a minimal working example using public data
Thanks @vincentarelbundock . I guess the problem is whether the prediction returns the mean of beta or the distribution of beta itself. posterior_epred() and predictions() gives the expectation of beta, whereas posterior_predict() and predicted_draws() return the beta distribution (before taking the expectation). This is a nice explaination https://www.andrewheiss.com/blog/2022/09/26/guide-visualizing-types-posteriors/images/beta.pdf.
Here is an example adapted from https://www.andrewheiss.com/blog/2021/11/08/beta-regression-guide/#zero-inflated-beta-regression-bayesian-style
Is there a way to tell predictions() not taking expectation? Like for logistic regression it will return 0 and 1 values instead of the proportation of 1.
library(tidyverse) # ggplot, dplyr, %>%, and friends
library(brms) # Bayesian modeling through Stan
library(tidybayes) # Manipulate Stan objects in a tidy way
library(broom) # Convert model objects to data frames
library(vdemdata) # Use data from the Varieties of Democracy (V-Dem) project
library(ggdist) # Special geoms for posterior distributions
library(gghalves) # Special half geoms
library(ggbeeswarm) # Special distribution-shaped point jittering
library(ggrepel) # Automatically position labels
library(patchwork) # Combine ggplot objects
library(scales) # Format numbers in nice ways
library(marginaleffects) # Calculate marginal effects for regression models
library(modelsummary) # Create side-by-side regression tables
set.seed(1234) # Make everything reproducible
# Define the goodness-of-fit stats to include in modelsummary()
gof_stuff <- tribble(
~raw, ~clean, ~fmt,
"nobs", "N", 0,
"r.squared", "R²", 3
)
# Custom ggplot theme to make pretty plots
# Get the font at https://fonts.google.com/specimen/Barlow+Semi+Condensed
theme_clean <- function() {
theme_minimal(base_family = "Barlow Semi Condensed") +
theme(panel.grid.minor = element_blank(),
plot.title = element_text(family = "BarlowSemiCondensed-Bold"),
axis.title = element_text(family = "BarlowSemiCondensed-Medium"),
strip.text = element_text(family = "BarlowSemiCondensed-Bold",
size = rel(1), hjust = 0),
strip.background = element_rect(fill = "grey80", color = NA))
}
# Make labels use Barlow by default
update_geom_defaults("label_repel", list(family = "Barlow Semi Condensed"))
# Format things as percentage points
label_pp <- label_number(accuracy = 1, scale = 100,
suffix = " pp.", style_negative = "minus")
label_pp_tiny <- label_number(accuracy = 0.01, scale = 100,
suffix = " pp.", style_negative = "minus")
vdem_clean <- vdem %>%
select(country_name, country_text_id, year, region = e_regionpol_6C,
polyarchy = v2x_polyarchy, corruption = v2x_corr,
civil_liberties = v2x_civlib, prop_fem = v2lgfemleg, v2lgqugen) %>%
filter(year >= 2010, year < 2020) %>%
drop_na(v2lgqugen, prop_fem) %>%
mutate(quota = v2lgqugen > 0,
prop_fem = prop_fem / 100,
polyarchy = polyarchy * 100)
model_beta_zi <- brm(
bf(prop_fem ~ quota,
phi ~ quota,
zi ~ quota),
data = vdem_clean,
family = zero_inflated_beta(),
chains = 4, iter = 2000, warmup = 1000,
cores = 4, seed = 1234
)
# predicted_draws from tidybayes
beta_zi_pred <- model_beta_zi %>%
predicted_draws(newdata = tibble(quota = c(FALSE, TRUE)))
sum(beta_zi_pred$.prediction[beta_zi_pred$quota == FALSE] == 0)
hist(beta_zi_pred$.prediction[beta_zi_pred$quota == FALSE], breaks = 100)
# posterior_predict
beta_zi_pred <- model_beta_zi %>%
posterior_predict(newdata = tibble(quota = c(FALSE, TRUE)))
beta_zi_pred <- beta_zi_pred[,1]
sum(beta_zi_pred == 0)
hist(beta_zi_pred, breaks = 100)
# posterior_epred
beta_zi_pred <- model_beta_zi %>%
posterior_epred(newdata = tibble(quota = c(FALSE, TRUE)))
beta_zi_pred <- beta_zi_pred[,1]
sum(beta_zi_pred == 0)
hist(beta_zi_pred, breaks = 100)
# predictions
beta_zi_pred <- model_beta_zi %>%
predictions(newdata = tibble(quota = c(FALSE, TRUE))) %>%
posterior_draws()
sum(beta_zi_pred$draw == 0)
hist(beta_zi_pred$draw[beta_zi_pred$quota == FALSE], breaks = 100)
Have you tried changing the type
argument? This is documented at the very top of the Bayesian vignette:
Oh yes type = "prediction" works. Thank you very much.
Hi there,
I wonder if zero_inflated_beta() in brms is supported? I tried avg_predictions() and predictions(), none of them provided 0 value predications. predicted_draws() function from tidybayes gave 0 value predictions. The distribution of the prediction from the former look like Gaussion, whereas the one from predicted_draws() look like beta distributions. Maybe there is a trick to use margnaleffects on zero_inflated_beta model? Thanks a lot.