nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License
156 stars 23 forks source link

How to do truncation with automatic derivatives? #1398

Open pehkawn opened 8 months ago

pehkawn commented 8 months ago

I am trying to run a BUGS model using nimbleHMC(). This requires automatic derivatives (AD) enabled. However, when trying to create the model with readBUGSmodel(), after setting nimbleOptions(buildModelDerivs = TRUE), returns the following warning:

[Warning] Truncation via 'T' or 'I' is not supported for derivatives. This model cannot be compiled.

Attempting to run the model with either nimbleHMC() or nimbleMCMC() will result in the following error:

Error: Failed to create the shared library. Run 'printErrors()' to see the compilation errors.

Set nimbleOptions(buildModelDerivs = FALSE) and create the model object again, and I can run nimbleMCMC() without errors.

I am using truncation in the model to constrain the distributions of continuous variables based on given thresholds thd to represent levels in ordinal variables z in the data.

for(i in 1:N){
    for(j in 1:P){
            y[i, j] ~ dnorm(mu[i, j], psi[j])T(thd[j, z[i, j]], thd[j, z[i, j] + 1])
    }
}

Since the error says specifically "Truncation via 'T' or 'I' is not supported for derivatives.", it implies there is another method to do truncation in Nimble that is supported for derivatives.

I was wondering how truncation can be accomplished, that are supported with derivatives?

perrydv commented 8 months ago

Hi @pehkawn Thanks for the issue. We encourage support questions to go to the nimble-users list (see r-nimble.org). Some of the error-trapping from the (still new) AD support is inconsistent and we are working to improve that. I can see how the word "via" is ambiguous, but indeed it means that truncation is not supported for AD. That is because the implementation of T uses CDFs and those are not built into the AD system yet (and in some cases may not be easy to support). @paciorek may want to chime in. HTH.

paciorek commented 8 months ago

One case that should work very soon is that if you had additional parameters in the model (i.e., not mu,psi,thd) then you should be able to use HMC on those other parameters and non-HMC sampling for the parameters that we can't take derivatives with respect to because it would involve the truncated density,, which involves CDFs.

We are about to release a new version of nimble in the next couple weeks that allows users more flexibility in terms of what can be handled in models with respect to derivatives and therefore HMC.

In the new version, something like this should work, using HMC on mu0, but not on mu:

code <- nimbleCode({
    y ~ T(dnorm(mu,1),0,Inf)
    mu ~ dnorm(mu0,1)
    mu0 ~ dnorm(0,1)
})

m <- nimbleModel(code, data = list(y=0), buildDerivs=TRUE)
cm <- compileNimble(m)

library(nimbleHMC)
conf <- configureMCMC(m)
conf$removeSampler('mu0')
conf$addSampler("NUTS")

mcmc <- buildMCMC(conf)
cmcmc <- compileNimble(mcmc, project = m)
runMCMC(cmcmc,niter=10)