Closed jsocolar closed 10 months ago
I see the problem. However, I would not want to prevent these models from being excluded from mixture models because of this. Perhaps a warning could be sufficient.
I defer to you (obviously), but want to make sure that it's clear that this isn't a problem in pp_mixture()
but rather a problem with the generated stancode.
set.seed(2)
dat <- data.frame(
y = c(rep(0, 100), runif(10) rnorm(5, .5, .1))
)
mix <- mixture(gaussian, zero_inflated_beta)
make_stancode(y ~ 1, dat, family = mix)
Yields stan code including
for (n in 1:N) {
array[2] real ps;
ps[1] = log(theta1) + normal_lpdf(Y[n] | mu1[n], sigma1);
ps[2] = log(theta2) + zero_inflated_beta_lpdf(Y[n] | mu2[n], phi2, zi2);
target += log_sum_exp(ps);
}
But it's not correct to log_sum_exp
through a mixture of probability densities and probability masses. The correct code here would be
for (n in 1:N) {
array[2] real ps;
if(y[n] == 0){
target += log(theta2) + log(zi2);
} else {
ps[1] = log(theta1) + normal_lpdf(Y[n] | mu1[n], sigma1);
ps[2] = log(theta2) + log1m(zi2) + beta_lpdf(Y[n] | mu2[n], phi2);
target += log_sum_exp(ps);
}
}
Here's a simple example that shows the consequences for parameter recovery:
library(brms)
set.seed(2)
dat <- data.frame(
y = c(rep(0, 100), runif(20), rnorm(20, .5, .1))
)
mix <- mixture(gaussian, zero_inflated_beta)
prior <- c(
prior(normal(.5, .05), Intercept, dpar = mu1),
prior(normal(0, .2), Intercept, dpar = mu2),
prior(lognormal(1,1), sigma1, lb = 0.01)
)
fit <- brm(y ~ 1, dat, family = mix,
prior = prior,
backend = "cmdstanr",
cores = 4)
summary(fit)
I included the prior lb
on sigma1
because this model desperately wants to estimate sigma1
on the boundary, which leads to very poor sampling when the boundary is zero.
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept 0.00 0.00 -0.00 0.00 1.00 4359 2987
mu2_Intercept 0.06 0.12 -0.17 0.30 1.00 3651 2576
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma1 0.01 0.00 0.01 0.01 1.00 2869 1761
phi2 3.47 0.68 2.28 4.88 1.00 3765 2475
zi2 0.02 0.02 0.00 0.09 1.00 2789 1638
theta1 0.71 0.04 0.63 0.78 1.00 3981 2731
theta2 0.29 0.04 0.22 0.37 1.00 3981 2731
Even with the lb
on sigma1
(not to mention the zero-avoiding prior), this model estimates mu1
as zero; i.e. in the ten-sigma region of its marginal prior. But the normal part of the likelihood should never even see any of the zeros in the data--in the correct Stan code, they would all go straight to the zero-inflation term via the control flow.
I think that pp_mixture()
is in fact returning the correct mixture weights given the model that was fitted; i.e. treating the probability of a structural zero via zi as a probability density rather than a probability mass.
Note that the Stan code for mixtures including zero_inflated_poisson
, zero_inflated_binomial
etc is unproblematic, because these families don't mix probability masses and probability densities, and mixture()
's requirement not to mix integer and continuous families will prevent them from being mixed with probability densities later on.
Ah, that makes sense. So the problem are the continuoues zi/hu families, such as zero_inflated_lognormal?
Yup!
Got it. Will change accordingly.
Fixed :-)
While tinkering with https://github.com/paul-buerkner/brms/issues/1551, I realized that
mixture()
misbehaves as follows:If I do
I error with
This makes sense.
But if I do:
brms
doesn't complain. Moreover, if I doI get (in addition to a bunch of divergences) the final output
This is incorrect; rows 1 and 2 are structurally guaranteed to arise from the zero-one-inflated-beta component which puts probability mass rather than probability density over zero and one.
I think the solution is to prohibit hurdle families as arguments to
mixture()
.