Genentech / jmpost

https://genentech.github.io/jmpost/
Document that users need to set their own initial values for truncated distributions #268

Closed gowerc closed 3 months ago

gowerc commented 7 months ago

A common pattern in Stan is to give a parameter a prior with unconstrained support and then simply declare the constraint on the parameter within the Stan code.

For example:

real<lower=0> sigma;
sigma ~ cauchy(0, 5);

Our current method for initial value generation is to sample a single value from the prior distribution. However, this doesn't currently respect the constraints placed on the parameter within the Stan code, which can lead to the model failing on the first pass, e.g.

Running MCMC with 1 chain...

Chain 1 Rejecting initial value:
Chain 1   Error evaluating the log probability at the initial value.
Chain 1 Exception: lb_free: Lower bounded variable is -0.47888, but must be greater than or equal to 0.000000 (in '/var/folders/hs/gyg0q5g94pz917klnkg7tnt80000gq/T/RtmpXZWCT7/model-153173ae186df.stan', line 181, column 4 to column 45)
Chain 1 Exception: lb_free: Lower bounded variable is -0.47888, but must be greater than or equal to 0.000000 (in '/var/folders/hs/gyg0q5g94pz917klnkg7tnt80000gq/T/RtmpXZWCT7/model-153173ae186df.stan', line 181, column 4 to column 45)

Couple of questions arise from this:

danielinteractive commented 7 months ago

Wow, I was not aware that this is a recommendation for how to work in Stan. While sometimes truncated distributions are useful, I don't think we should encourage that on the jmpost level.

gowerc commented 7 months ago

Really interesting discussion about this here.

If I'm reading that right, then the use of truncated distributions for priors is valid, as the sampler only cares about the density up to proportionality and the truncation doesn't affect that. However, it does have a meaningful impact if used for the likelihood. Considering we don't use truncated distributions for the likelihood, I don't think that last point impacts us.
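To spell out the proportionality argument, here is a rough sketch using the `cauchy(0, 5)` prior from the example above. It assumes the prior's location and scale are fixed constants; the symbols F, f, theta and L are introduced here purely for illustration and do not come from the linked discussion.

```latex
\begin{align*}
p(\sigma \mid \sigma > 0)
  &= \frac{\mathrm{Cauchy}(\sigma \mid 0, 5)}{1 - F_{\mathrm{Cauchy}}(0 \mid 0, 5)}
   = 2\,\mathrm{Cauchy}(\sigma \mid 0, 5)
   \propto \mathrm{Cauchy}(\sigma \mid 0, 5), \qquad \sigma > 0, \\
p(y \mid \theta,\; y > L)
  &= \frac{f(y \mid \theta)}{1 - F(L \mid \theta)} .
\end{align*}
```

In the first line the normalising factor is the constant 2, so dropping it leaves the posterior unchanged up to proportionality. In the second line the denominator depends on the parameters theta, so it cannot be dropped when the truncation applies to the likelihood.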

@danielinteractive - I'm guessing from your comment that we would just be pushing users to set their own initial values if they want to use truncated distributions?

danielinteractive commented 7 months ago

Yes, exactly. Generally I would not encourage or give examples of this, and would instead recommend (and provide examples of) distributions whose support is identical to the constraints.

gowerc commented 3 months ago

Re-opening this as the point came up again that this is not overly intuitive behaviour, especially given that the Stan team actively recommends that users use half-Cauchy distributions for variance parameters. At the very least, we should probably support >0 constraints.
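One possible way to support a >0 constraint is to keep redrawing the initial value from the prior until it falls inside the declared bounds. The sketch below is only an illustration of that idea, written in Python rather than the package's R code. The names `sample_cauchy` and `sample_initial_value` are hypothetical and are not part of jmpost.

```python
import math
import random


def sample_cauchy(rng, location, scale):
    """Draw one value from Cauchy(location, scale) via the inverse CDF."""
    u = rng.random()
    return location + scale * math.tan(math.pi * (u - 0.5))


def sample_initial_value(draw, lower=None, upper=None, max_tries=1000):
    """Redraw from the prior until the value lies strictly inside the bounds.

    `draw` is any zero-argument function returning one prior draw.
    This helper is hypothetical, not jmpost's API.
    """
    for _ in range(max_tries):
        value = draw()
        if lower is not None and value <= lower:
            continue  # violates the lower bound, try again
        if upper is not None and value >= upper:
            continue  # violates the upper bound, try again
        return value
    raise RuntimeError("no draw satisfied the constraints")


# Initial value for `real<lower=0> sigma; sigma ~ cauchy(0, 5);`
rng = random.Random(1234)
sigma_init = sample_initial_value(
    lambda: sample_cauchy(rng, 0.0, 5.0), lower=0.0
)
```

For a prior that is symmetric around the bound, as here, roughly half of the draws are rejected, so the loop should finish quickly. For a prior with very little mass inside the bounds, a direct sampler for the truncated distribution would be a better fit than rejection.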