Closed athowes closed 1 week ago
This code I wrote in PR #220 might be useful:
#' The latent lognormal LPDF function
#'
#' To be tested against an exposed Stan version. I think that this would be
#' used in the forthcoming `log_lik_latent_lognormal` function.
#'
#' @param y ...
#' @param mu ...
#' @param sigma ...
#' @param pwindow ...
#' @param swindow ...
#' @param obs_t ...
#' @family postprocess
#' @autoglobal
#' @export
latent_lognormal_lpdf <- function(y, mu, sigma, pwindow, swindow, obs_t) {
d <- y - pwindow + swindow
obs_time <- obs_t - pwindow
lpdf <- dlnorm(d, meanlog = mu, sdlog = sigma, log = TRUE)
lcdf <- plnorm(obs_time, meanlog = mu, sdlog = sigma, log.p = TRUE)
return(sum(lpdf - lcdf))
}
Based on F2F with @kgostic and @parksw3:
This function relies on the prep
object from brms
.
The prep
object has the following elements:
> names(prep)
[1] "ndraws" "nobs" "resp" "family" "dpars" "nlpars" "refcat" "ac" "data"
I am looking for the latent variables needed to compute the log likelihood. I think they should be in prep$data
but alas:
> names(prep$data)
[1] "Y" "vreal1" "vreal2" "vreal3" "" "" "wN" "noverlap" "woverlap" "" "" ""
prep
? Put in generated quantities?wN
noverlap
and woverlap
out? These are just deterministic quantities I guess?Here is what is going on inside brms::prepare_predictions
:
> brms:::prepare_predictions.brmsfit
function (x, newdata = NULL, re_formula = NULL, allow_new_levels = FALSE,
sample_new_levels = "uncertainty", incl_autocor = TRUE, oos = NULL,
resp = NULL, ndraws = NULL, draw_ids = NULL, nsamples = NULL,
subset = NULL, nug = NULL, smooths_only = FALSE, offset = TRUE,
newdata2 = NULL, new_objects = NULL, point_estimate = NULL,
ndraws_point_estimate = 1, ...)
{
x <- restructure(x)
options(.brmsfit_version = x$version$brms)
on.exit(options(.brmsfit_version = NULL))
snl_options <- c("uncertainty", "gaussian", "old_levels")
sample_new_levels <- match.arg(sample_new_levels, snl_options)
ndraws <- use_alias(ndraws, nsamples)
draw_ids <- use_alias(draw_ids, subset)
warn_brmsfit_multiple(x, newdata = newdata)
newdata2 <- use_alias(newdata2, new_objects)
x <- exclude_terms(x, incl_autocor = incl_autocor, offset = offset,
smooths_only = smooths_only)
resp <- validate_resp(resp, x)
draw_ids <- validate_draw_ids(x, draw_ids, ndraws)
draws <- as_draws_matrix(x)
draws <- suppressMessages(subset_draws(draws, draw = draw_ids))
draws <- point_draws(draws, point_estimate, ndraws_point_estimate)
sdata <- standata(x, newdata = newdata, re_formula = re_formula,
newdata2 = newdata2, resp = resp, allow_new_levels = allow_new_levels,
internal = TRUE, ...)
new_formula <- update_re_terms(x$formula, re_formula)
bframe <- brmsframe(new_formula, data = x$data)
prep_re <- prepare_predictions_re_global(bframe = bframe,
draws = draws, sdata = sdata, resp = resp, old_reframe = x$ranef,
sample_new_levels = sample_new_levels, )
prepare_predictions(bframe, draws = draws, sdata = sdata,
prep_re = prep_re, resp = resp, sample_new_levels = sample_new_levels,
nug = nug, new = !is.null(newdata), oos = oos, stanvars = x$stanvars)
}
Which I think is then feeding into:
> brms:::prepare_predictions.bframel
function (x, draws, sdata, ...)
{
ndraws <- nrow(draws)
nobs <- sdata[[paste0("N", usc(x$resp))]]
out <- nlist(family = x$family, ndraws, nobs)
class(out) <- "bprepl"
out$fe <- prepare_predictions_fe(x, draws, sdata, ...)
out$sp <- prepare_predictions_sp(x, draws, sdata, ...)
out$cs <- prepare_predictions_cs(x, draws, sdata, ...)
out$sm <- prepare_predictions_sm(x, draws, sdata, ...)
out$gp <- prepare_predictions_gp(x, draws, sdata, ...)
out$re <- prepare_predictions_re(x, sdata, ...)
out$ac <- prepare_predictions_ac(x, draws, sdata, nat_cov = FALSE,
...)
out$offset <- prepare_predictions_offset(x, sdata, ...)
out
}
So perhaps it looks like there are lots of these special parts prepare_predictions_xy
which get out parts of the draws that particular brms
models need. Maybe we are in the bad world where we'd need something like this. I don't know if there would be a way to tag the parameters we need as one of these things.
Again useful to have the Stan code to refer back to here:
> stancode
// generated with brms 2.21.6
functions {
// code chunks used from epidist 0.0.0.9000
// Here the strings
// * lognormal
// * vector mu, vector sigma
// * mu, sigma
// are/have been replaced using regex
real latent_lognormal_lpdf(vector y, vector mu, vector sigma, vector pwindow,
vector swindow, array[] real obs_t) {
int n = num_elements(y);
vector[n] d = y - pwindow + swindow;
vector[n] obs_time = to_vector(obs_t) - pwindow;
return lognormal_lpdf(d | mu, sigma) - lognormal_lcdf(obs_time | mu, sigma);
}
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
// data for custom real vectors
array[N] real vreal1;
// data for custom real vectors
array[N] real vreal2;
// data for custom real vectors
array[N] real vreal3;
int prior_only; // should the likelihood be ignored?
int wN;
array[N - wN] int noverlap;
array[wN] int woverlap;
}
transformed data {
}
parameters {
real Intercept; // temporary intercept for centered predictors
real Intercept_sigma; // temporary intercept for centered predictors
vector<lower = 0, upper = 1>[N] swindow_raw;
vector<lower = 0, upper = 1>[N] pwindow_raw;
}
transformed parameters {
real lprior = 0; // prior contributions to the log posterior
vector<lower = 0>[N] pwindow;
vector<lower = 0>[N] swindow;
swindow = to_vector(vreal3) .* swindow_raw;
pwindow[noverlap] = to_vector(vreal2[noverlap]) .* pwindow_raw[noverlap];
if (wN) {
pwindow[woverlap] = swindow[woverlap] .* pwindow_raw[woverlap];
}
lprior += normal_lpdf(Intercept | 2, 0.5);
lprior += normal_lpdf(Intercept_sigma | 0, 0.5);
}
model {
swindow_raw ~ uniform(0, 1);
pwindow_raw ~ uniform(0, 1);
// likelihood including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = rep_vector(0.0, N);
// initialize linear predictor term
vector[N] sigma = rep_vector(0.0, N);
mu += Intercept;
sigma += Intercept_sigma;
sigma = exp(sigma);
target += latent_lognormal_lpdf(Y | mu, sigma, pwindow, swindow, vreal1);
}
// priors including constants
target += lprior;
}
generated quantities {
// actual population-level intercept
real b_Intercept = Intercept;
// actual population-level intercept
real b_sigma_Intercept = Intercept_sigma;
}
As to what should be happening:
Usually there would be some parameters in the $\theta$ (or transformations of them) which would be sufficient for the likelihood. Things like the "distributional parameters" which are extracted out with brms::get_dpar
.
My guess is that for the "pointwise" log-likelihood we do want to be using all of the draws of $\theta$ and not integrating anything out.
The alternative would be something like. Let $\theta = (\phi, \lambda)$. Where $\phi$ are "latent" variables. Then the marginal likelihood is:
$$ p(y | \lambda) = \int p(y | \phi, \lambda) p(\phi | \lambda) \text{d}\phi $$
I've managed to get something working here by:
Generating the latent variables randomly rather than extracting them
Ignoring the overlap bit
should be easy to fix. 1. might be hard.
Added a question on the Stan forum about this: https://discourse.mc-stan.org/t/log-likelihood-for-brms-custom-family-that-depends-on-latent-variables/36415
For compatibility of our models with model comparison tools like
loo
we need to write functions for the pointwise log likelihood.For example, for the latent lognormal , this would look like the following:
Here I have used
latent_lognormal_lpdf
as if it were exported from the Stan code, but it's not possible to do that.I think that the challenging part of doing this is that the latent variables like
swindow_raw
andpwindow_raw
need to be integrated out. I could probably write some way to do this using quadrature.Is this a problem you've already encountered in the academic side of this work @seabbs @kgostic @parksw3? Any pointers?
How does this fit into the bigger picture
loo
functionality (I think)