Closed danielturek closed 8 months ago
@paciorek After some additional reflection, I now feel the correct approach is to push the zero_mean
constraint into the individual scalar updates of the CAR process components.
The CAR_normal
sampler itself iterates over (nested) scalar samplers operating on the scalar components of the CAR process. These scalar samplers come in 3 different flavors, which are mixed and matched at the time of MCMC configuration/building according to:
CAR_scalar_postPred
- assigned to scalar components with no stochastic dependenciesCAR_scalar_conjugate
- assigned to scalar components in conjugate relationship with dependenciesCAR_scalar_RW
- fallback assignmentMy belief is that these 3 separate scalar CAR samplers should now individually, internally, handle the zero_mean
constraint, as:
CAR_scalar_postPred
- "propose" an update to target scalar by drawing from prior; center entire CAR process; if logProb is valid, then accept the update to target scalar, and also to the entire CAR process (centering the entire process).CAR_scalar_conjugate
- "propose" an update to target scalar using conjugacy; center entire CAR process; if logProb is valid, then accept the update to target scalar, and also to the entire CAR process (centering the entire process).CAR_scalar_RW
- genuinely propose an update to target scalar using MH proposal; center entire CAR process; if logProb is valid and MH decides to accept, then accept the update to target scalar, and also to the entire CAR process (centering the entire process).I believe this approach maintains the reversibility of the Markov chain.
I've never been sure that the centering on the fly is strictly valid though of course people do it a lot. This article seems like it might be useful to understand the issue better. Perhaps take a look and see if relevant (I have not read it)?
I am less familiar with CAR than both of you, but here are my quick thoughts. I am not sure having each of the scalar CAR samplers recenter the entire vector is a good idea because wouldn't that break the efficiency motivation of CAR, which is to update nodes one at a time without requiring computations on other nodes? I am also not sure that it would make sense (per first suggestion) to have the behavior be to leave nodes uncentered in the event that centering would create some invalid value in the model. That seems potentially like more complicated and less internally consistent behavior. It strikes me that this is a case where simply the error can arise and in that case the centering should not be used. It seems like a rare case.
Whether the centering on the fly is strictly valid seems like a somewhat distinct issue.
We could enhance documentation on the topic.
@paciorek @perrydv Given everything, and a mild hesitation to revamp the CAR_normal
sampler with a new "centering on the fly" algorithm, my current preference is:
In the CAR_normal
sampler, if zero_mean = 1
, add the following code:
if(zero_mean) {
targetValues <- values(model, target)
values(model, target) <<- targetValues - mean(targetValues)
logProb <- model$calculate(calcNodes)
## new code follows:
if(logProb is *not* valid) {
print a warning message, explaining what happened.
}
}
Note this does not add an additional model$calculate
call to every iteration of the CAR_normal
sampler, since this calculation occured anyway after the centering (see original code, in the first message above) and will help diagnose and warn about this (uncommon) situation.
What do you guys think? Pending agreement, I'll draft the code.
@danielturek @perrydv I support this change to add a warning. I suggest that the warning explicitly say that the invalid values occurred because of the centering step.
Addressed in PR #1400
@perrydv @paciorek
Perry, Chris, I've uncovered a somewhat minor issue, but an issue nonetheless with the
CAR_normal
sampler. This all has to do with thezero_mean
option for thedcar_normal
distribution:When
zero_mean = 1
(as above), it has the effect of invoking the following code in theCAR_normal
sampler, which just goes ahead and "centers" the CAR process values:(this "centering" takes place after sequentially updating each scalar component of the CAR process).
The issue this can cause: by modifying all the CAR process values (
target
nodes) without doing any checking, this can cause other parts of the model to become invalid. The case I ran into is somewhat involved, and perhaps there are much simpler examples, but again, by modifying all the CAR process values (centering them around zero), other model dimensions which depend on them can be pushed into invalid regions. (The specific example I ran into uses a user-defined distribution for latent states, which depends on the CAR process values, and the density calculation has internal boundary thresholds.)One option is to not use
zero_mean = 1
, and also remove the other parameter (representing the mean) from the model. This is a solution, which does seem to work in my case.But I also would like to fix the sampler, so it works with
zero_mean = 1
. One option that comes to mind is akin to modifying theCAR_normal
sampler code as:This is not wholly satisfying either, since I could imagine the CAR process values straying further and further from being "centered", and then perhaps a lack of identifyability with the other "mean" parameter in the model, and who knows what else.
Another more involved option could be to move the "centering" into each (nested) scalar update of the CAR process components. That is, the
CAR_normal
sampler operates by iterating over updates of each scalar CAR component, and the update of each scalar CAR component is something like:At some point, it feels like this raises into question the "reversibility" of the Markov chain.
I would very much welcome any thoughts from @paciorek .
Our current implementation of
zero_mean = 1
is somewhat ad hoc, and the original BUGS implementation perhaps also would not work with arbitrary user-defined distributions.