Robustify handling of numerical issues in samplers

paciorek commented 2 weeks ago

WIP

This is a draft that addresses NCT issue 538.

Per @danielturek , it creates checkLogProb, which has two purposes - one to replace NaN values with -Inf (since we generally want to take a calculation that produces NaN as a rejection), and to print a warning message when an Inf is found.

I've modified various samplers and utility functions to use the new function. It feels a bit kludgey in that we handle logProb calculations (and some degree of checking already) in various ways in our samplers. I've tried to interject checkLogProb in the most reasonable locations in the samplers, but my choices are open to other ideas. Given the accretion of the various components of acceptance/rejection, it was hard to do this in clean, consistent way. My goal is simply to make the samplers more robust.

@danielturek I'd particularly like your review of this.

Will be interesting to see what happens with testing. Given we haven't seen a lot of tickets related to numerical issues, I'm hoping these changes will have little effect. However, I am a bit worried that there have been cases of using samplers where Inf values might arise (such as the stickbreaking nimble-users post that inspired all this) where the sampling might give invalid results.

I'll also note that PR #1509 is related to this.

Other things to consider:

[ ] perhaps use checkLogProb in the new code in sampler_CRP in PR #1509.
[ ] look at SMC samplers to see if we want to do this type of work there
[ ] look at HMC samplers as well

danielturek commented 1 week ago

@paciorek Thanks for tagging me on this.

This feels like I'm overlooking something for certain, but how is it that decide is defined as a regular R function (as opposed to a nimbleFunction), and is used extensively in run function code of samplers? What am I overlooking?

Would another call to checkLogProb be appropriate for line 372 of MCMC_samplers.R? This is in the core RW sampler, where we separate the logProbs for the target node and dependent nodes:

            logMHR <- logMHR + model$calculateDiff(calcNodesNoSelf) + propLogScale

How about on line 121 of MCMC_utils.R, which is in the function setAndCalculateOne ?

       lp <- model$calculate(calcNodes)

Same for line 164 of MCMC_utils.R, which is in the function setAndCalculate ?

Same for line 180 of MCMC_utils.R, which is in the function setAndCalculateDiff ?

(I know that sometimes the result of these functions in the setAndCalculate family are later wrapped in checkLogProb. But from my review, this isn't always the case, so my suggestions above are defensive).

For as much as I've tried to standardize things, looking over this has really shown me what a hodgepodge of different structures and approaches the various samplers use ...

Overall, this PR feels like a good approach, given that we want to handle this corner case.

paciorek commented 1 week ago

@danielturek thanks, I will look at your suggestions in detail.

As far as decide, it's also a C++ function in Utils.cpp.

paciorek commented 1 week ago

Thanks for catching the missing checkLogProb in sampler_RW.

As far as potential uses in those locations in MCMC_utils.R, my logic was to use checkLogProb just before the result would be used for a accept/reject decision or something similar, and not just whenever we call calculate. So I'm inclined to leave them out of MCMC_utils.R except in decideAndJump.

nimble-dev / nimble

Robustify handling of numerical issues in samplers #1512