Closed mebrooks closed 7 years ago
Yes, I did implement in TMB in SpatialDeltaGLMM
here
// CMP distribution
template<class Type>
Type dCMP(Type x, Type mu, Type nu, int give_log=0, int iter_max=30, int break_point=10){
// Explicit
Type ln_S_1 = nu*mu - ((nu-1)/2)*log(mu) - ((nu-1)/2)*log(2*M_PI) - 0.5*log(nu);
// Recursive
vector<Type> S_i(iter_max);
S_i(0) = 1;
for(int i=1; i<iter_max; i++) S_i(i) = S_i(i-1) * pow( mu/Type(i), nu );
Type ln_S_2 = log( sum(S_i) );
// Blend (breakpoint: mu=10)
Type prop_1 = invlogit( (mu-break_point)*5 );
//Type S_comb = prop_1*exp(ln_S_1) + (1-prop_1)*exp(ln_S_2);
Type log_S_comb = prop_1*ln_S_1 + (1-prop_1)*ln_S_2;
// Likelihood
Type loglike = nu*x*log(mu) - nu*lgamma(x+1) - log_S_comb;
// Return
if(give_log) return loglike; else return exp(loglike);
}
I seem to remember it having trouble with the logistic transition from ln_S_1
(closed-form) to ln_S_2
(finite-sum) versions
Thanks Jim. Do you think the trouble you mentioned is bad enough that I shouldn't use this? Did you use it in one of your papers? I guess this combination of the closed-form and finite-sum is to avoid the if-statement in the R code about summing close enough to infinity. Which reference should I check out to explain the reasoning for the combination here?
I haven't used the TMB code for CMP in any paper, our original paper exploring its potential for ecological applications used R code exclusively. And that paper outlines the two (recursive and closed-form) calculations as well as why we use this parameterization (where you could include a log-linked linear predictor for mu)
I built the CMP into SpatialDeltaGLMM
to explore spatio-temporal patterns in species richness, but never did get around to publishing my early explorations for the Eastern Bering Sea (didn't seem sufficiently interesting). So no, not published
Personally, I would try building it into glmmTMB
and seeing how it performs: I think my problem was that the logistic ramp (which yes, avoids an if-then statement) was causing inner-optimizer problems in a spatial model that had thousands of random effects. With smaller mixed models it seems less likely that any would be precisely at the ramp-point.
Thanks Jim! Your code works great!
I just wanted to follow up on this. TMB now contains dcompois
, dcompois2
, rcompois
, and rcompois2
, documented here. The compois
functions are parameterized with the approximate mode while the compois2
ones use the mean by solving for it internally.
That's great, thanks @mebrooks!
Have you thought about adding a compound Poisson-gamma ("Tweedie") distribution too? I've just been implementing it using the Foster and Bravington (2012) parameterization (log-linked linear predictors on the expected number of individuals, and the average weight of each individual), and code is available as function dPoisGam
here, although its quite slow (requires approximating a summation across an infinite series, similar to the CMP)
I'd also love to explore adding the N-mixture model to standard GLM packages in case you're interested to discuss. Maybe a topic for the developers workshop...
Hi Jim, I just wanted to check before I reinvent the wheel...you didn't by chance write versions of
dCMP
andcompute_CMP_constant
in C++ for TMB did you? I was thinking about adding CMP to glmmTMB. If you haven't done it already, then I can just translate your R into C++. Cheers!