Closed martinmodrak closed 4 years ago
Is it possible to pin down the part of math/stan/math/prim/prob/neg_binomial_2_lpmf.hpp
where the numerical problems occur with large phi? My guess is that it's in computing the binomial coefficient, which you seem to be doing by computing the logarithms of the factorials using lgamma
. Can the problem be just numerical inaccuracy of lgamma
with large input? Are you using Stirling's approximation or something to compute the logarithm of the gamma function?
Here is my idea:
@jtimonen You are correct, my work with the non-log version in #1497 shows that the Poisson branch can be eliminated with a better implementation of binomial_coefficient_log
- this better implementation is almost ready in #1614 and so all this should bubble up eventually also here. Note that my experience with reimplementing lbeta
is that Stirling approximation is not enough in and of itself and corrections are needed (see #1612) to maintain precision when phi
is large and x
is small.
Sorry that I wasted your time as this connection was not apparent in the issue here.
If I read this right, then you are suggesting to improve the lgamma implementation... I do doubt that there are issues of these sort of. We are using the lgamma_r
implementation as a default and fall back to boost::math::lgamma
on Windows. I have stared a lot at the boost thing and I would be relatively surprised if there is any issue with large arguments or whatever. This thing is coded with great care.
So before you embark on this, please test the lgamma if it is really not precise enough.
@wds15 I do not suggest to improve lgamma
. The trick I mention concerns differences of lgamma
calls for large arguments where precision might be lost to catastrophic cancellation (e.g. lbeta
, binomial_coefficient_log
). See #1612 for more details.
I see. Being smart about sums of lgamma sounds like a good approach. Sorry.. I am out of context.
This is super tedious work you are looking into here. So much appreciated.
Some people quoted me for knowing these things... not really, I just have dealt with a few of these hiccups whenever I had to.
My previous post had an mistake already in the definition of NB pmf. I am just leaving a (hopefully) correct version here If anyone is wondering. I had to derive this anyway so it wasn't waste of time
Description
For some values,
neg_binomial_2_log_lpmf
would return values larger than 0 (e.g.neg_binomial_2_log_lpmf(39, 4, 6e16) == 256
) for other, it just returns noticeably different values than the correspondingneg_binomial_2_lpmf
call. In all cases this is related to largephi
values (noticeable discrepancies manifest already at aroundphi = 2e5
).This seems to be related to/a regression of #428
Example
neg_binomial_2_log_lpmf(39, 4, 6e16) == 256
. I've created a failing test on the branch: https://github.com/stan-dev/math/compare/develop...martinmodrak:bugfix/1495-neg_binomial_2_log_large_phiExpected Output
The same as
neg_binomial_2_lpmf(39, exp(4), 6e16)
, which is-5.22991
Current Version:
v3.0.0