James-Thorson / Conway-Maxwell-Poisson

R package for calculating probability mass function or simulating draws from a Conway-Maxwell-Poisson distributimon
4 stars 0 forks source link

TMB code? #1

Closed mebrooks closed 7 years ago

mebrooks commented 7 years ago

Hi Jim, I just wanted to check before I reinvent the wheel...you didn't by chance write versions of dCMP and compute_CMP_constant in C++ for TMB did you? I was thinking about adding CMP to glmmTMB. If you haven't done it already, then I can just translate your R into C++. Cheers!

James-Thorson commented 7 years ago

Yes, I did implement in TMB in SpatialDeltaGLMM here

// CMP distribution
template<class Type>
Type dCMP(Type x, Type mu, Type nu, int give_log=0, int iter_max=30, int break_point=10){
  // Explicit
  Type ln_S_1 = nu*mu - ((nu-1)/2)*log(mu) - ((nu-1)/2)*log(2*M_PI) - 0.5*log(nu);
  // Recursive
  vector<Type> S_i(iter_max);
  S_i(0) = 1;
  for(int i=1; i<iter_max; i++) S_i(i) = S_i(i-1) * pow( mu/Type(i), nu );
  Type ln_S_2 = log( sum(S_i) );
  // Blend (breakpoint:  mu=10)
  Type prop_1 = invlogit( (mu-break_point)*5 );
  //Type S_comb = prop_1*exp(ln_S_1) + (1-prop_1)*exp(ln_S_2);
  Type log_S_comb = prop_1*ln_S_1 + (1-prop_1)*ln_S_2;
  // Likelihood
  Type loglike = nu*x*log(mu) - nu*lgamma(x+1) - log_S_comb;
  // Return
  if(give_log) return loglike; else return exp(loglike);
}

I seem to remember it having trouble with the logistic transition from ln_S_1 (closed-form) to ln_S_2 (finite-sum) versions

mebrooks commented 7 years ago

Thanks Jim. Do you think the trouble you mentioned is bad enough that I shouldn't use this? Did you use it in one of your papers? I guess this combination of the closed-form and finite-sum is to avoid the if-statement in the R code about summing close enough to infinity. Which reference should I check out to explain the reasoning for the combination here?

James-Thorson commented 7 years ago

I haven't used the TMB code for CMP in any paper, our original paper exploring its potential for ecological applications used R code exclusively. And that paper outlines the two (recursive and closed-form) calculations as well as why we use this parameterization (where you could include a log-linked linear predictor for mu)

I built the CMP into SpatialDeltaGLMM to explore spatio-temporal patterns in species richness, but never did get around to publishing my early explorations for the Eastern Bering Sea (didn't seem sufficiently interesting). So no, not published

Personally, I would try building it into glmmTMB and seeing how it performs: I think my problem was that the logistic ramp (which yes, avoids an if-then statement) was causing inner-optimizer problems in a spatial model that had thousands of random effects. With smaller mixed models it seems less likely that any would be precisely at the ramp-point.

mebrooks commented 7 years ago

Thanks Jim! Your code works great!

mebrooks commented 7 years ago

I just wanted to follow up on this. TMB now contains dcompois, dcompois2, rcompois, and rcompois2, documented here. The compois functions are parameterized with the approximate mode while the compois2 ones use the mean by solving for it internally.

James-Thorson commented 7 years ago

That's great, thanks @mebrooks!

Have you thought about adding a compound Poisson-gamma ("Tweedie") distribution too? I've just been implementing it using the Foster and Bravington (2012) parameterization (log-linked linear predictors on the expected number of individuals, and the average weight of each individual), and code is available as function dPoisGam here, although its quite slow (requires approximating a summation across an infinite series, similar to the CMP)

I'd also love to explore adding the N-mixture model to standard GLM packages in case you're interested to discuss. Maybe a topic for the developers workshop...