There are some subtle errors in the likelihoods of some capture-recapture models (Chapter 6 of the Bayesian Population Analysis translations). I noticed the errors as a side effect of digging into a separate issue on the Stan Forums: https://discourse.mc-stan.org/t/capture-recapture-model-with-partial-or-complete-pooling/20393/32
Here's an overview of the model that should help clarify what the problem is:
State model
These models are two-component finite mixture models, where the components correspond to a partly observed binary latent variable z[i] for individuals i=1, ..., N. The two components are:
z[i] = 0: the individual is not in the population
z[i] = 1: the individual is in the population
This latent quantity is modeled as a Bernoulli: z[i] ~ Bernoulli(omega).
Observation model
There are T sampling occasions, and on each sampling occasion an individual is either detected or not. The observation model is y[i, t] ~ Bernoulli(p * z[i]) for t=1, ..., T, or, summing the number of detections to use a binomial instead, y[i] ~ Binomial(T, p * z[i]).
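To make the state and observation models concrete, here is a minimal simulation sketch in Python (used instead of Stan only so it runs without compiling a model). The function name, the parameter values, and the seed are illustrative assumptions, not taken from the BPA code.

```python
import random


def simulate_capture_histories(N, T, omega, p, seed=0):
    """Simulate z[i] ~ Bernoulli(omega) and y[i, t] ~ Bernoulli(p * z[i]).

    Hypothetical helper for illustration; assumes 0 <= omega, p <= 1.
    """
    rng = random.Random(seed)
    # State model: is individual i in the population?
    z = [1 if rng.random() < omega else 0 for _ in range(N)]
    # Observation model: detection probability is p if z[i] = 1, else 0.
    y = [[1 if rng.random() < p * z[i] else 0 for _ in range(T)]
         for i in range(N)]
    return z, y


# Illustrative values only.
N, T = 200, 5
z, y = simulate_capture_histories(N=N, T=T, omega=0.6, p=0.3)

# Binomial form: number of detections per individual.
detections = [sum(row) for row in y]

n_detected = sum(1 for s in detections if s > 0)
n_present_undetected = sum(
    1 for zi, s in zip(z, detections) if zi == 1 and s == 0
)
print("detected at least once:", n_detected)
print("in the population but never detected:", n_present_undetected)
```

Individuals with all-zero detection histories are a mix of z[i] = 0 and z[i] = 1 cases, which is why z[i] is only partly observed.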
Stan implementation
Marginalizing over the discrete state z leads to the following integrated observation model: https://github.com/stan-dev/example-models/blob/master/BPA/Ch.06/M0.stan#L29-L38
Essentially, when an individual is observed, we know z[i] = 1, and otherwise we have to sum over z[i] = 0 and z[i] = 1 in the likelihood.
The issue
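As a reference point, the marginalized likelihood contribution of individual i can be written out from the state and observation models. This is a sketch in the binomial form, writing y_i for the number of detections of individual i. It is derived from the model description here rather than copied from the linked Stan code:

$$
\Pr(y_i \mid \omega, p) =
\begin{cases}
\omega \binom{T}{y_i} p^{y_i} (1 - p)^{T - y_i}, & y_i > 0, \\
\omega (1 - p)^{T} + (1 - \omega), & y_i = 0.
\end{cases}
$$

On the log scale, the second case is a log_sum_exp over the two branches z[i] = 1 and z[i] = 0.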
For some of these capture-recapture models, there's an error in the likelihood associated with the case when an individual is in the population but not detected (z[i] = 1 and max(y[i, 1:T]) = 0).
The observation model conditional on z[i] = 1 should be binomial_lpmf(0 | T, p), which equals T * log(1 - p), or equivalently bernoulli_lpmf(y[i] | p), where y[i] is an integer array of 0's of length T. The following models have a subtle error in that they incorrectly use bernoulli_lpmf(0 | p) instead, which equals log(1 - p) and so accounts for only one of the T non-detections:
For an example of a model with the correct likelihood, see M0: https://github.com/stan-dev/example-models/blob/master/BPA/Ch.06/M0.stan#L37
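The size of the discrepancy can be checked numerically. The sketch below is in Python rather than Stan so it can be run directly. The functions bernoulli_lpmf, binomial_lpmf, and log_sum_exp are small stand-ins written here to mirror the Stan functions of the same names, and the values of omega, p, and T are arbitrary assumptions chosen for illustration.

```python
import math


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def bernoulli_lpmf(y, theta):
    """Log Bernoulli pmf; assumes 0 < theta < 1 and y in {0, 1}."""
    return math.log(theta) if y == 1 else math.log1p(-theta)


def binomial_lpmf(k, n, theta):
    """Log binomial pmf; assumes 0 < theta < 1 and 0 <= k <= n."""
    return (math.log(math.comb(n, k))
            + k * math.log(theta)
            + (n - k) * math.log1p(-theta))


def never_detected_loglik(omega, p, T, correct=True):
    """Marginal log-likelihood for an individual with zero detections."""
    if correct:
        # All T occasions were non-detections.
        in_pop_term = binomial_lpmf(0, T, p)
    else:
        # The erroneous form: only a single non-detection is counted.
        in_pop_term = bernoulli_lpmf(0, p)
    return log_sum_exp(
        bernoulli_lpmf(1, omega) + in_pop_term,  # z[i] == 1
        bernoulli_lpmf(0, omega),                # z[i] == 0
    )


# Arbitrary illustrative values.
omega, p, T = 0.6, 0.3, 5
correct = never_detected_loglik(omega, p, T, correct=True)
wrong = never_detected_loglik(omega, p, T, correct=False)
print("correct:  ", correct)
print("incorrect:", wrong)
```

The two versions agree only when T = 1. For T > 1 the incorrect form gives too much weight to the z[i] = 1 branch for individuals that were never detected.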