Closed paciorek closed 4 months ago
@paciorek Thanks for tagging me on this.
To address your highest level design questions upfront:
RW
sampler, rather than being its own standalone sampler. That degree of code duplication would not be ok.control
list, as you've implemented it.ASIS
. Perhaps includeRE
? I think that would be better.Another design option to consider: perhaps only having one control
argument, called includeRE
. If it has the default value of FALSE
, operation is the usual RW
sampler. If includeRE
is provided as TRUE
, then we automatically do some model processing and checking, and self-determine asisType
and asisEffects
. And, if includeRE = TRUE
but the model structure is not amenable to this, we throw an error in the setup
function. I feel I somewhat prefer this, to make it (much) more automated. Discussion is welcomed, I know it reduces fine-grain-user-control, but it feels reasonable to me in this case.
Prior to line 256 (subsetting ccList$calcNodesNoSelf
) would it be wise to force asisEffects
through expandNodeNames
)? When asisEffects
is user-supplied, I imagine it would usually be only a variable name re
, or perhaps indexed as re[1:n]
.
Would likes 256 and 257 be more clear (and perhaps more efficient) if written using setdiff
?
Also, it seems line 257 might have a typo. As currently written, wouldn't this give a warning about non-multiple vector lengths? Was it intended to be:
ccList$calcNodesNoSelf <- ccList$calcNodesNoSelf[!ccList$calcNodesNoSelf %in% asisEffects]
I agree a lot of the checks should be added, per the comments in the current code. Perhaps also add:
## TODO: check asisEffects none are data (quite defensive)
## TODO: check asisEffects all come from the same declID (perhaps more important)
The second of those, above, seems important to cover some non-standard cases, e.g. some of the random effects come from a different declaration for example:
for(i in 1:r) b[i] ~ dnorm(target, sd = whatever)
for(i in (r+1):n) b[i] ~ dnorm(target + something_else, sd = whatever)
or one could imagine where the mean is target
in both cases, but the sd
arguments are different: whatever1
and whatever2
.
Documentation at the end of MCMC_samplers.R
should be updated, to include the Yu and Meng (2011) reference.
I would please request some formatting changes, to be consistent with other code/samplers, but that can come later (and I'd be quite happy to make those changes).
Could a test be added, to demonstrate the correct asymptotic posterior is achieved?
@danielturek thanks for this very useful and detailed feedback.
On the question of automated checking, I agree this would be nice. However, I am wary of (a) being confident about catching all the special cases given our past experience in analogous situations and (b) the extra work of doing that given other priorities. My thought was that we describe to users when the sampler is appropriate and move some of the burden onto the user, albeit with some double-checking.
@paciorek What you said makes sense; I certainly understand your point that "in the past, we're never able to anticipate and handle all use-cases; users always come up with something we didn't imagine".
My only lingering hesitation, is this approach makes the new option a big step more annoying to use. It would be so much nicer to write:
conf$addSampler("mu", "RW", includeRE = TRUE)
In the end, I'm fine either way.
A middle ground might be to allow your example input but warn user in docs and, specifically, in a printed note what the sampler assumes, particularly for things we don't fully check. Let me think on that particular aspect more.
Hi @paciorek @danielturek I am catching up here. A few thoughts:
Here's a variation on some of the things under discussion.
We could have a distinct sampler, perhaps: sampler_joint_effects
or sampler_noncentered_effects
or sampler_joint_RE
. In its setup
we then have various options for checking and what not. In run
it just calls out sampler_RW
. (We could also allow a user to ask for various types of samplers on the top node, e.g. RW or slice, if we want a version of this that does slice sampling for top nodes.)
The actual sampling could done in our current sampler(s), in particular sampler_RW
. Now that would mean that, as I have in the current PR, sampler_RW
has some code blocks that are new and only for this use case. I don't see a way of avoiding that given the code blocks are embedded in the sampler. But if we have the user interface go through a separate sampler, then I think it's clearer to users what the sampler is, and we don't need to modify documentation for our core samplers. Of course we could also start out with the embedded sampler being a duplicate of sampler_RW
but with the extra blocks and then combine into sampler_RW
at a later date if avoiding code duplication seems like the right thing to do after further consideration. For the moment, actually, maybe that is the way to go.
To @perrydv 's point about correctness, yes, agreed and very good point. Where I am leaning at the moment is that we put in all the calculations to ensure correctness but then if we can in a simple fashion check and be confident that we are in a situation where we can leave calculations out, then we leave them out. I will also see, in the CDFW case, how much time is added to do the extra calculations - it may well be minimal.
Based on this discussion and an offline discussion with Perry, here's what I've done:
sampler_noncentered
that has an embedded RW or slice sampler, depending on what user specified in its control
. The setup determines the dependent effects and passes them into setup for the embedded sampler.sampler_RW
and sampler_slice
to allow for joint sampling of target and set of dependent effects. This changes to code for these samplers and makes them less readable in terms of standard usage but avoids duplicated code. This choice can be revisited.sampler_noncentered
. Not yet done: writing documentation for sampler_noncentered
and setting up tests.
Ok, one update here. If we want to have a single sampler_RW
that has regular sampler and allows the noncentered variation, we'd need a nimbleFunctionList to handle the fact that in run code we need to call getParam(node, 'mean')
inside an if statement that is not active when using standard RW sampler.
At the moment this has tipped the balance into having our current sampler_RW
and a new sampler_RW_noncentered
and similarly for slice sampler. There will be a bunch of duplicated code. Could be revisited in the future.
@perrydv @danielturek (@kenkellner in case you're interested given we've used this in the CDFW comparisons), I've cleaned up the noncentered sampler, adding roxygen and testing, and fixing somethings revealed in testing.
I think this is close to finished. The roxygen deserves some attention as I spent some time trying to describe the sampler carefully, but more eyes would help.
@danielturek please weigh in now on what formatting changes you'd like.
@paciorek I looked over this carefully, and it looks great. This will be a nice addition.
@paciorek I am taking a fresh look at this as we run up to a release.
sampler_RW
. Since the outer object (sampler_noncentered
) can use getParam
to form the vector of relevant means if necessary, maybe it could just use a set_mean
method in sampler_RW
before running the sampler_RW
, i.e. provide the information from above. But again I recognize we are not going to shift implementation for this release and sorry to keep opening up new ideas.mean
parameter seems to preclude using this for a user-defined distribution. The same kind of issue came up recently in sampler_categorical
, with the difference that the getParam
calls were needed only in setup code. I'm thinking we could either remove this error-trap, or provide a control element to disable it, or try calling getParam in the setup code to see if there is a mean parameter. I feel moderately strongly that a user should be able to write a distribution with a mean
and be able to use this sampler.Also, how about shortening control list arguments:
samplerParam
to param
?samplerType
to either sampler
or type
?I've added a dynamic check for presence of a mean
parameter.
As far as simpler names, we can't use type
as that would conflict with the type='noncentered'
if the user passes control args via ...
. I'll change to sampler
and param
.
This modifies the RW sampler to allow for joint sampling of the target along with random effects whose mean is the target or whose sd is the target, i.e., a joint sampler for either
mu, b[1:p]
orsigma, b[1:p]
whereb[i] ~ dnorm(mu, sd = sigma)
.The sampler adds sampling in the noncentered parameterization, assuming that the model is set up for centered parameterization and samplers are already being applied in the centered parameterization. In particular the sampler proposes
mu*
(orsigma*
) and then deterministically setsb[i]*
to beb[i] - mu + mu*
ormu + (sigma*/sigma) (b[i] - mu)
. The Jacobian (in the latter case) is omitted because the code omits calculation of theb[i]
prior from the Metropolis ratio, which the Jacobian would cancel with.This is a special case (but covers much of the cases envisioned in the paper) of the ASIS sampling strategy of Yu and Meng (2011; 10.1198/jcgs.2011.203main). Hence naming the option 'ASIS'. That said, it's not very informative, but other options would probably be wordy (e.g, "joint_with_random_effects")
The main question here from an API perspective is whether we want this as a control option to the RW sampler. It is only relevant for some models and might be confusing to users. However, most of the code is identical to the RW sampler, so having as a separate sampler would involve a bunch of code duplication. @perrydv @danielturek, I would like feedback from you on this question.
Additional notes: 1.) TBD how much checking we do to ensure the random effect distribution is valid vs. telling the user what it should be and expecting them to check. 2.) It would be straightforward to extend to the case where
b[i] ~ dnorm(mu + eta[i] + gamma*z[i], sigma)
and the target is any ofmu
,gamma
, orsigma
. 3.) Adding an ASIS option to other scalar samplers, in particular slice, is straightforward (and I have draft code for it). Not sure if we want to do that too.We are currently using this sampler to explore sampling in MSOMs with species-level random effects.
I cancelled CI testing for this as there is no testing code and in principle this has no effect on existing use of RW sampler.