Open crsh opened 3 years ago
Thanks for posting the issue, especially for digging into the root cause.
There's no way, in general, to avoid underflow and overflow in floating point. It seems like this problem may be fixable because it looks like the values you are trying to unconstrain originally came from samples. Those should have finite inverses if we get the numerics right. For arbitrary new inputs, there's no way to guarantee finiteness without harming other numerical properties (or at least we've never figured out how to do it---just putting finite upper or lower bounds leads to all kinds of problems with monotonicity and derivatives).
For the example code at hand, would it be possible to just use the original unconstrained variables rather than round tripping?
I'm pinging @paul-buerkner so he doesn't miss a brms issue.
Thank you for your input, Bob. I don't think this can be fixed from the brms side either as the relevant rstan functions are called inside bridgesampling and brms merely passes stuff to the corresponding stanfit method.
It seems like the fundamental problem is trying to go from unconstrained to constrained and then back again. That round trip should work, but apparently it doesn't here. But it doesn't look like there's an easy way to grab the unconstrained params. Maybe @bgoodri will have some idea.
@crsh: Do you have any idea which transforms are failing. If we can isolate the problem, we might be able to fix on the Stan side in our constrain/unconstrain code.
It seems like the fundamental problem is trying to go from unconstrained to constrained and then back again.
That was my impression as well.
@crsh: Do you have any idea which transforms are failing. If we can isolate the problem, we might be able to fix on the Stan side in our constrain/unconstrain code.
Well, in the case above I was under the impress that it was the atanh
-transform, but I assume this could come up in other transformations as well. I hit this problem during a simulation study and I've had a few more runs that errored (assumedly for the same reason), but the fit above was the only one I have been able to dig into thus far. I can try to inspect other fits, but I probably won't get around to it before the end of next week. I expect that if there is another parameter in my model that could cause issues, it would be the positively constrained standard deviations (e.g., when 0
is unconstrained via log(0)
).
We don't use atanh. The full set of transforms for constraints is documented in the Reference Manual chapter on Constraint Transforms. The inverse logit for upper and lower-bounded is documented in this section.
We use exp() for upper- or lower-bounded.
Where do the values being mapped come from? If they are constrained values for draws from the posterior, they should be mathematically invertible. If they're new values, then there may be no way to deal with the overflow.
One thing to do is regularize with priors to keep estimates from degenerating to boundaries.
We don't use atanh. The full set of transforms for constraints is documented in the Reference Manual chapter on Constraint Transforms.
Sorry, I'm a little confused. From what I understand, the section you link to lists the inverse hyperbolic tangent transform for cholesky-factorized correlation matrices. The variable L_1[2,1]
that is transformed to infinity is the off-diagnoal of a 2x2 correlation matrix. I also verfied that applying atanh
to other draws yields the same values as those returned by unconstrain_pars()
. What am I missing?
One thing to do is regularize with priors to keep estimates from degenerating to boundaries.
Good point. In my case, there's definitley room to make the LKJ-prior more informative.
lists the inverse hyperbolic tangent transform for cholesky-factorized correlation matrices.
I was the one who was confused. I thought we were talking about an upper-and-lower bounded scalar, for which we use inverse logit.
For a 2 x 2 correlation matrix, that off-diagonal in the Cholesky is the only free parameter, because the rows have to be unit length with positive diagonals. What must be happening is that it's running off to +/- 1 correlation, which is toward +/- inf on the unconstrained scale. So is the value being unconstrained equal to -1 or 1? If so, there's no finite transform. Regularization could help here.
The other way to parameterize a 2 x 2 correlation matrix is with a correlation constrained to be between [-1, 1],
parameters {
real<lower=-1, upper=1> rho;
}
transformed parameters {
corr_matrix[2] Sigma = [[1, rho], [rho, 1]];
}
model {
(rho + 1) / 2 ~ beta(a, b); // affine rescaling doesn't need Jacobian
The cost to factor a 2 x 2 covariance matrix is pretty trivial.
So is the value being unconstrained equal to -1 or 1?
Yes, it's exactly 1 and transforms to inf.
Thanks for the suggested parameterization. Given that samples this extreme occur only very rarely in my case, I think I will simply regularize by using a more informative prior so I can stick with the brms
interface for my simulations.
If there is no other way to address this, I wonder how bridgesampling
could address this issue on their end. Could the bridgesampler ignore draws that contain infinities? This will probably bias the estimates, but may not be a big problem as long as the number of discarded samples is low (in this case it's 1/120.000)?
I'm afraid I don't know. I think this'll have to be an issue for whichever package is using bridgesampling. It may be that they can bypass the round trip to constrained parameters. I'm pinging @paul-buerkner and @bgoodri and @jgabry who should have a better sense than me if and where this can be addressed.
Summary:
unconstrain_pars()
may return infinities for constrained samples that (possibly due to numerical problems) are on the bound of the parameter space. This can cause downstream problems, e.g., when attempting to use the unconstrained prameters for bridge sampling.Description:
I fit a model using
brms
and tried to estimate the model's marginal likelihood usingbridge_sampling()
, but got the following error:After some digging, I realized that the reason was an infinite value returned by
unconstrain_pars()
. Specifically, in one MCMC draw contains a correlation coefficient estimate of exactly 1 was backtransformed viaatanh()
toInf
.Reproducible Steps:
The model was fit using
cmdstanr
, so this requires a recent version ofbrms
. I use:Download the
stanfit
-object here.The following code reproduces the error in the bridge sample:
The following shows where the infinite unconstrained value arises:
Current Output:
Because the object returned by
unconstrain_pars()
does not give parameter names, I had to look up the parameter name in the constrained samples:Expected Output:
An appropriate finite value.
RStan Version:
2.21.2
R Version:
4.1.1 (2021-08-10)
Operating System:
Ubuntu 18.04