nimble-dev / nimble

The base NIMBLE package for R
http://R-nimble.org
BSD 3-Clause "New" or "Revised" License

Still want support for both <- and ~ nodes in a model #745

Open danielturek opened 6 years ago

danielturek commented 6 years ago

I would still like support for nodes that follow both a stochastic and a deterministic declaration in the model.

I've needed this once before for a project, and worked around it using a custom distribution. I now find myself needing it again, and once again have to explain that the only way to do this is via a custom distribution that performs both the deterministic calculation and the likelihood evaluation.

Note that WinBUGS does support this.

e.g.,

a ~ dnorm(0, 1)
b ~ dnorm(0, 1)
c <- a+b
c ~ dwhatever(...)

perrydv commented 6 years ago

Is there a case where this is not an artificial construction? If a and b each have a density, their sum can't have a separate density. IIRC, JAGS does not support this. I've also had users ask about such a need, but it has usually reflected incorrect thinking about their problem.

danielturek commented 6 years ago

Yes, certainly. In an earlier project I was working on (about 2 years ago now) they were modelling manatee deaths. Deaths due to some causes were known, and we could count them. Other deaths were due to unknown causes, and those were modelled stochastically. The sum of the known and unknown numbers of deaths was then assigned to follow some other distribution, as a function of the entire population size and the death rate. Again, one component of that sum was observed counts and the other was being modelled. That is, the total number of deaths (those we saw and those we didn't see) is some stochastic function of the total population size and the death rate.

In this case, the total number of individuals in each particular area is a derived quantity based on limited observations. Then, again, this total number of individuals is also modelled stochastically, as a function of the environmental factors, habitat, etc., at each location.

Anyway, yes, there do seem to be compelling (not stupid and contrived) use cases for this.

I did a little thinking about how I think it should work:

deterministic nodes

stochastic nodes

deterministic and stochastic nodes

Of course, there would have to be special processing to allow for 2 definitions of such a node.

I'm not sure about the right handling of things like isStoch, but my initial thought is that such a node should be handled mostly like a stochastic node: isStoch = TRUE, and getDependencies stops there unless downstream = TRUE.
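
For reference, here is a minimal sketch of how these queries behave today on a model with ordinary nodes, which is the behaviour the proposal would extend. The toy model and its node names (a, s, d, e) are made up for illustration; isStoch, isDeterm and getDependencies are existing model methods.

```r
library(nimble)

# Hypothetical toy model:
# a (stochastic) -> s (deterministic) -> d (stochastic) -> e (stochastic)
code <- nimbleCode({
  a ~ dnorm(0, 1)
  s <- a + 1
  d ~ dnorm(s, 1)
  e ~ dnorm(d, 1)
})

model <- nimbleModel(code, inits = list(a = 0, d = 0, e = 0))

# Node-type queries as they work today: each node is one type or the other
stochFlags  <- model$isStoch(c("a", "s", "d"))
determFlags <- model$isDeterm(c("a", "s", "d"))

# Default traversal passes through deterministic nodes and stops at the
# first stochastic node downstream of "a"
depsDefault <- model$getDependencies("a")

# downstream = TRUE keeps going past stochastic nodes
depsDownstream <- model$getDependencies("a", downstream = TRUE)
```

Under the suggestion above, a node with both declarations would presumably act like d here: it would end the default traversal from a, and only downstream = TRUE would continue past it.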

Just some thoughts. But I really would like to see this supported.

I know JAGS does not support this, but WinBUGS does.

paciorek commented 6 years ago

Adding this comment from defunct 'core feature ideas' wiki:

This would permit combinations of model nodes to be specified as following a particular distribution. For example, perhaps we wish to say the sum of two model components follows a normal distribution. One of these components is observed data, and the other is a latent model variable. Yes, this simple example is otherwise tractable because of the normal distributions, but hopefully it makes the point:
code <- nimbleCode({
    x ~ dnorm(b0 + b1*c1 + b2*c2, sd = sigma_x)
    y <- x + observedData
    y ~ dnorm(mu, sd = sigma_y)
})
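
Until something like this is supported, the example can be written with the custom-distribution workaround mentioned earlier in the thread. The following is only a sketch under assumptions: dSumNorm and rSumNorm are hypothetical names, all numeric values are placeholders, and the observed value is treated as the "data" of a distribution that computes the sum internally.

```r
library(nimble)

# Hypothetical custom density: does the deterministic step (the sum) and the
# stochastic step (the normal log-density of the sum) in one place.
dSumNorm <- nimbleFunction(
  run = function(x = double(0), latent = double(0), mu = double(0),
                 sigma = double(0), log = integer(0, default = 0)) {
    returnType(double(0))
    y <- latent + x                                         # y <- x + observedData
    logProb <- dnorm(y, mean = mu, sd = sigma, log = TRUE)  # y ~ dnorm(mu, sd = sigma_y)
    if (log) return(logProb)
    return(exp(logProb))
  })

# Matching simulator: draw the sum, then subtract the latent part.
rSumNorm <- nimbleFunction(
  run = function(n = integer(0), latent = double(0), mu = double(0),
                 sigma = double(0)) {
    returnType(double(0))
    y <- rnorm(1, mean = mu, sd = sigma)
    return(y - latent)
  })

code <- nimbleCode({
  x ~ dnorm(b0 + b1 * c1 + b2 * c2, sd = sigma_x)
  observedData ~ dSumNorm(latent = x, mu = mu, sigma = sigma_y)
})

# Placeholder values, for illustration only
model <- nimbleModel(
  code,
  constants = list(b0 = 0, b1 = 1, b2 = 1, c1 = 0.5, c2 = -0.5,
                   sigma_x = 1, mu = 2, sigma_y = 1),
  data = list(observedData = 1.5),
  inits = list(x = 0.5)
)

totalLogProb <- model$calculate()  # log-density of x plus that of the sum
```

In this particular normal case the same model could be written as observedData ~ dnorm(mu - x, sd = sigma_y), as the comment above concedes. The custom distribution only earns its keep when the distribution on the sum has no such closed-form rearrangement.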