nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License
160 stars 24 forks source link

handle predictive nodes in sampler_CRP #1402

Closed paciorek closed 10 months ago

paciorek commented 10 months ago

This sets predictive nodes to be included in dependencies for sampler_CRP, which for logistical reasons and to ensure validity of sampling, imposes some requirements in terms of the number of dependent nodes for each stochastic index being sampled.

@danielturek a quick eye in terms of where I am setting the nimble option would be helpful.

danielturek commented 10 months ago

@paciorek The logic makes sense - and the location of the added code seems ok, also.

My only possible holdup is that, with this change, we can have the situation where some samplers are including PP nodes in dependency calculations, while simultaneously other samplers are excluding PP nodes. Can this lead to incorrect inferences? I'm not 100% certain either way, but it seems very plausible to me that having part of the model (or, updating of part of the model) operate conditional on PP nodes as data, and other parts of the model being updated without conditioning on PP nodes could lead to incorrect inferences.

Is a confident answer to that ("yes, it's ok", or "no, there's in fact a problem with that") obvious to you?

paciorek commented 10 months ago

Very good point - I should have thought of that and thanks for thinking carefully about this. I'm not confident.

One possibility is to check in configureMCMC whether any CRP nodes are present and there are any predictive nodes and then turn MCMCusePredictiveDependenciesInCalculations on. I think we should then leave it on, and tell the user, since we want any use of addSampler to also have MCMCusePredictiveDependenciesInCalculations be TRUE.

However, I think that is probably too clever (in that there are probably cases where it wouldn't be clever enough), and we should simply error out at the beginning of configureMCMC and tell the user if they want to set up an MCMC on such a model they should set MCMCusePredictiveDependenciesInCalculations themselves and rerun configureMCMC.

Thoughts?

danielturek commented 10 months ago

@paciorek This is a bit of a rabbit-hole, because the root cause issue was unequal-sized groups of CRP cluster nodes triggering an error, which itself was caused by some CRP dependencies being data and some not being data (and hence being PP; and thereafter excluding the non-data-PP nodes caused unequal groups).

So the situation seems more complex:

I've just sat for a minute, and tried to think about what's reasonable behavior here. These are the best thoughts I came up with, which are not extremely helpful:

  1. So long as we don't support the unequal group sizes, one idea is: if this "unequal group size error" is triggered, maybe giving additional information in the error message, to add: "this could be a result of the MCMC excluding PP nodes from sampler calculations; using MCMCusePredictiveDependenciesInCalculations = TRUE may resolve this problem"
  2. The root-cause-fix would be adding the logic, data structures, and indexing to support CRP cluster groups of unequal sizes.
  3. Per this discussion, it's possible that a user is presently running an MCMC which is mathematically incorrect.
  4. ?
paciorek commented 10 months ago

@danielturek thanks for helping me think this through; this is helpful.

I agree that my proposed solution is conservative in that it may not be needed in some cases. So maybe the solution is to refine my proposed trapping behavior so that configureMCMC looks for (1) a model with a dCRP node and (2) if there are any predictive nodes that are children of the "clusterNodes" (the nodes that are cluster parameters). If so, error out with a message saying that the MCMC might be feasible if the user sets the nimbleOption.

As far as what induces what, we really are expecting that there are equal group sizes and that the likely way that wouldn't happen currently is if some of the dependent nodes for a given group are predictive nodes. Other situations such as you mention are possible in principle but would require unusual model structures and should (fingers crossed) cause errors to be produced via all the checking in sampler_CRP setup code.

What do you think of my revised proposal?

danielturek commented 10 months ago

@paciorek This makes good sense to me. So:

This seems reasonable to me. Do you want to make this change?

paciorek commented 10 months ago

Yes, I will do so.