Metropolis MCMC using Riemann Langevin updates

kaskr / adcomp

AD computation with Template Model Builder (TMB)

Other

167 stars 80 forks source link

Metropolis MCMC using Riemann Langevin updates #182

Closed James-Thorson closed 3 years ago

James-Thorson commented 8 years ago

Kasper and everyone,

I was reading through a method for MCMC sampling using Langevin proposals on a Riemann manifold of the neg-log-like function:

http://arxiv.org/pdf/1309.2983v1.pdf

courtesy of Daniel Simpson. The algorithm between Eq. 9 and 10 should be feasible to implement and test, except that Gamma_i requires the gradient of elements of the hessian matrix w.r.t model coefficients. I have read that TMB has access to higher-order derivatives -- is this gradient-of-the-hessian calculation convenient?

cheers, jim

skaug commented 8 years ago

Yes, the "gradient-of-the-hessian" is used in TMB to get the derivative of Laplace approximation of with respect to model parameters.

James-Thorson commented 8 years ago

I had imagined that the gradient of the log-determinant of the Hessian is available (as you confirm), but haven't seen how to access the gradient of individual elements of the Hessian matrix. Is there any simple example-code for accessing this gradient-of-Hessian-elements within the R terminal, or I guess equivalently accessing 3rd derivatives of the neg-log-like function w.r.t. one parameter twice and a 2nd parameter once?

jim

On Mon, Apr 11, 2016 at 2:26 AM, Hans J. Skaug notifications@github.com wrote:

Yes, the "gradient-of-the-hessian" is used in TMB to get the derivative of Laplace approximation of with respect to model parameters.

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/kaskr/adcomp/issues/182#issuecomment-208248710

kaskr commented 8 years ago

@James-Thorson There is not yet a user friendly way to get these derivatives. The following function can be used for now:

get3rdDeriv <- function(obj, par = obj$env$par, w = NULL){
    ADHess <- environment(obj$env$spHess)$ADHess
    .Call("EvalADFunObject",
          ADHess$ptr,
          par,
          control = list(
              order = as.integer(1), 
              hessiancols = integer(0),
              hessianrows = integer(0),
              sparsitypattern = as.integer(0), 
              rangecomponent = as.integer(1),
              dumpstack = as.integer(0),
              rangeweight = w),
          PACKAGE = obj$env$DLL)
}

Here's a demo:

library(TMB)
runExample("simple")

Get all derivatives (280 non-zero hessian elements times 118 parameters) :

D3mat <- get3rdDeriv(obj)
dim(D3mat)

You can get specific linear combinations of derivatives very fast (using only one reverse sweep):

w <- rnorm(280)
wD3mat <- get3rdDeriv(obj, w=w)
range( wD3mat - t(w) %*% D3mat )

colemonnahan commented 8 years ago

@James-Thorson FYI Michael Betancourt has some papers on RHMC and from what I understand you have to be careful w/ using the Hessian because it isn't always positive definite.

Unfortunately the Hessian isn’t sufficiently well-behaved to serve as a metric itself; in general, it is not even guaranteed to be positive-definite. The Hessian can be manipulated into a well-behaved form, however, by applying the SoftAbs transformation and the resulting SoftAbs metric [22] admits a generic but efficient Riemannian Hamiltonian Monte Carlo implementation (Figure 10, 11).

From http://arxiv.org/pdf/1312.0906.pdf

I'd be curious to see if you could get it working with TMB. Supposedly it works great for low dimensional posteriors with very difficult geometries.