Open GidonFrischkorn opened 2 months ago
Hey @GidonFrischkorn ,
I've attempted to do this in the past, but ran into several issue.
The first issue is that brms
does not uniquely sample from non-unique priors.
For example, in the following model, we have proper priors on all the parameters, but because we've globally set a prior on the fixed effects, we only get one vector, while there are two vectors for the corresponding posteriors.
library(brms)
fit <- brm(mpg ~ 0 + Intercept + hp + am, data = mtcars,
prior = set_prior("normal(0, 20)"),
sample_prior = TRUE,
refresh = 0)
as.data.frame(fit) |> head()
#> b_Intercept b_hp b_am sigma prior_b prior_sigma lprior lp__
#> 1 24.73046 -0.05237320 6.167779 2.570381 6.852937 6.559109 -14.69582 -93.34539
#> 2 24.88039 -0.05116802 6.637926 2.344110 -18.265380 6.615502 -14.68886 -94.43643
#> 3 24.24229 -0.05051330 6.003589 2.606001 15.974921 11.080613 -14.66735 -94.23779
#> 4 25.01586 -0.04454338 6.689296 2.578515 17.692674 6.891727 -14.72284 -97.53541
#> 5 24.85939 -0.05628901 6.130560 2.826893 -20.598982 5.061051 -14.73246 -94.04391
#> 6 26.62790 -0.05369877 5.115034 3.234256 -37.351830 11.546792 -14.88307 -93.46047
# prior_b => b_hp + b_am
It might be possible to match these up in some way (as brms::hypothesis()
seems to do) -- we would also need to shuffle repeating columns to simulate truly orthogonal priors.
Additionally, when a flat prior is used, it is not sampled from (obviously), and so we again get a mismatch between available priors and posteriors. (With improper priors such as this, the marginal likelihood is not defined, so really I think BF computation should fail in such cases.)
fit <- brm(mpg ~ 0 + Intercept + hp + am, data = mtcars,
sample_prior = TRUE,
refresh = 0)
as.data.frame(fit) |> head()
#> b_Intercept b_hp b_am sigma prior_sigma lprior lp__
#> 1 25.84081 -0.05774245 4.515949 3.003870 7.6333134 -2.190473 -80.96692
#> 2 27.12705 -0.05564046 4.279162 3.161670 4.9799842 -2.210536 -80.72393
#> 3 24.41755 -0.05049852 5.888990 2.786568 8.3211920 -2.164224 -80.79550
#> 4 30.00162 -0.07928318 3.861281 2.843180 3.0833194 -2.170906 -82.98043
#> 5 30.76642 -0.08188406 4.548033 3.407905 3.7942492 -2.243454 -83.70852
#> 6 24.69123 -0.04633966 6.058427 2.589791 0.9906569 -2.141885 -81.19557
# ?? => b_hp + b_am
(Also note that the prior of the intercept is also not returned...)
If you have a specific estimate, you have manually set it up from both priors and posteriors, and then pass those (as data frames) to bayesfactor_parameters()
.
fit <- brm(mpg ~ 0 + Intercept + hp + am, data = mtcars,
prior = set_prior("normal(0, 20)"),
sample_prior = TRUE,
refresh = 0)
draws <- as.data.frame(fit)
# prior extimate for b_am - b_hp
prior_diff <- draws$prior_b - sample(draws$prior_b) # we need to shuffle!
# prior extimate for b_am - b_hp
posterior_diff <- draws$b_am - draws$b_hp
bayestestR::bayesfactor_parameters(
posterior_diff,
prior_diff
)
#> Bayes Factor (Savage-Dickey density ratio)
#>
#> BF
#> ------
#> 123.68
#>
#> * Evidence Against The Null: 0
bayestestR::unupdate()
Alternatively, don't set sample_prior = TRUE
, and use bayestestR::unupdate()
to get a priors only model (requires proper priors on all parameters).
fit <- brm(mpg ~ 0 + Intercept + hp + am, data = mtcars,
prior = set_prior("normal(0, 20)"),
sample_prior = "no", # (default)
refresh = 0)
fit_prior <- bayestestR::unupdate(fit)
draws_posterior <- as.data.frame(fit)
draws_prior <- as.data.frame(fit_prior)
all(names(draws_prior) %in% names(draws_posterior))
#> [1] TRUE
# prior extimate for b_am - b_hp
prior_diff <- draws_prior$b_am - draws_prior$b_hp
# prior extimate for b_am - b_hp
posterior_diff <- draws$b_am - draws$b_hp
bayestestR::bayesfactor_parameters(
posterior_diff,
prior_diff
)
#> Bayes Factor (Savage-Dickey density ratio)
#>
#> BF
#> ------
#> 108.92
#>
#> * Evidence Against The Null: 0
Describe the solution you'd like When estimating a Bayesian Regression model using
brms
you can select the optionsample_prior = "yes"
in order to also sample from the priors, in addition to obtaining posterior samples updated by the data. Currently, when having sampled the priors this way, thebayesfactor_parameters
function still samples from the priors again which requires additional waiting time. I wonder if there is an option to change this behavior of thebayesfactor_parameters
function to test if samples from the priors are saved in thebrmsfit
object and if that is the case do not sample from the priors again.How could we do it? I think there should be a simple way to test if a
brmsfit
object contains samples from the priors. And if that is the case these could be passed to thebayesfactor_parameters
function in a similar way as passing priors from an unupdated model via theprior
argument. Maybe I am overlooking something about the way the samples from the prior are stored in thebrmsfit
object. But thehypothesis
function inbrms
requires to include samples from the priors to obtain Bayes Factors, so the prior samples should be accessible in thebrmsfit
object.