nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License
156 stars 23 forks source link

Minor bug in `CAR_normal` sampler #1387

Closed danielturek closed 8 months ago

danielturek commented 8 months ago

@perrydv @paciorek

Perry, Chris, I've uncovered a somewhat minor issue, but an issue nonetheless with the CAR_normal sampler. This all has to do with the zero_mean option for the dcar_normal distribution:

s[1:N] ~ dcar_normal(..., zero_mean = 1)

When zero_mean = 1 (as above), it has the effect of invoking the following code in the CAR_normal sampler, which just goes ahead and "centers" the CAR process values:

if(zero_mean) {
    targetValues <- values(model, target)
    values(model, target) <<- targetValues - mean(targetValues)
    model$calculate(calcNodes)
    nimCopy(from = model, to = mvSaved, row = 1, nodes = calcNodes, logProb = TRUE)
}

(this "centering" takes place after sequentially updating each scalar component of the CAR process).

The issue this can cause: by modifying all the CAR process values (target nodes) without doing any checking, this can cause other parts of the model to become invalid.  The case I ran into is somewhat involved, and perhaps there are much simpler examples, but again, by modifying all the CAR process values (centering them around zero), other model dimensions which depend on them can be pushed into invalid regions.  (The specific example I ran into uses a user-defined distribution for latent states, which depends on the CAR process values, and the density calculation has internal boundary thresholds.)

One option is to not use zero_mean = 1, and also remove the other parameter (representing the mean) from the model.  This is a solution, which does seem to work in my case.

But I also would like to fix the sampler, so it works with zero_mean = 1.  One option that comes to mind is akin to modifying the CAR_normal sampler code as:

if(zero_mean) {
    targetValues <- values(model, target)
    values(model, target) <<- targetValues - mean(targetValues)
    logProb <- model$calculate(calcNodes)
    if(logProb is valid) {
        nimCopy(from = model, to = mvSaved, row = 1, nodes = calcNodes, logProb = TRUE)
    }
    if(logProb is *not* valid) {
        undo these "centering" changes, and leave uncentered for this iteration
    }
}

This is not wholly satisfying either, since I could imagine the CAR process values straying further and further from being "centered", and then perhaps a lack of identifyability with the other "mean" parameter in the model, and who knows what else.

Another more involved option could be to move the "centering" into each (nested) scalar update of the CAR process components. That is, the CAR_normal sampler operates by iterating over updates of each scalar CAR component, and the update of each scalar CAR component is something like:

At some point, it feels like this raises into question the "reversibility" of the Markov chain.

I would very much welcome any thoughts from @paciorek .

Our current implementation of zero_mean = 1 is somewhat ad hoc, and the original BUGS implementation perhaps also would not work with arbitrary user-defined distributions.

danielturek commented 8 months ago

@paciorek After some additional reflection, I now feel the correct approach is to push the zero_mean constraint into the individual scalar updates of the CAR process components.

The CAR_normal sampler itself iterates over (nested) scalar samplers operating on the scalar components of the CAR process. These scalar samplers come in 3 different flavors, which are mixed and matched at the time of MCMC configuration/building according to:

My belief is that these 3 separate scalar CAR samplers should now individually, internally, handle the zero_mean constraint, as:

I believe this approach maintains the reversibility of the Markov chain.

paciorek commented 8 months ago

I've never been sure that the centering on the fly is strictly valid though of course people do it a lot. This article seems like it might be useful to understand the issue better. Perhaps take a look and see if relevant (I have not read it)?

perrydv commented 8 months ago

I am less familiar with CAR than both of you, but here are my quick thoughts. I am not sure having each of the scalar CAR samplers recenter the entire vector is a good idea because wouldn't that break the efficiency motivation of CAR, which is to update nodes one at a time without requiring computations on other nodes? I am also not sure that it would make sense (per first suggestion) to have the behavior be to leave nodes uncentered in the event that centering would create some invalid value in the model. That seems potentially like more complicated and less internally consistent behavior. It strikes me that this is a case where simply the error can arise and in that case the centering should not be used. It seems like a rare case.

Whether the centering on the fly is strictly valid seems like a somewhat distinct issue.

We could enhance documentation on the topic.

danielturek commented 8 months ago

@paciorek @perrydv Given everything, and a mild hesitation to revamp the CAR_normal sampler with a new "centering on the fly" algorithm, my current preference is:

In the CAR_normal sampler, if zero_mean = 1, add the following code:

if(zero_mean) {
    targetValues <- values(model, target)
    values(model, target) <<- targetValues - mean(targetValues)
    logProb <- model$calculate(calcNodes)
    ## new code follows:
    if(logProb is *not* valid) {
        print a warning message, explaining what happened.
    }
}

Note this does not add an additional model$calculate call to every iteration of the CAR_normal sampler, since this calculation occured anyway after the centering (see original code, in the first message above) and will help diagnose and warn about this (uncommon) situation.

What do you guys think? Pending agreement, I'll draft the code.

paciorek commented 8 months ago

@danielturek @perrydv I support this change to add a warning. I suggest that the warning explicitly say that the invalid values occurred because of the centering step.

danielturek commented 8 months ago

Addressed in PR #1400