stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
745 stars 187 forks source link

neg_binomial_2_log_lpmf is not stable for large phi #1495

Closed martinmodrak closed 4 years ago

martinmodrak commented 4 years ago

Description

For some values, neg_binomial_2_log_lpmf would return values larger than 0 (e.g. neg_binomial_2_log_lpmf(39, 4, 6e16) == 256) for other, it just returns noticeably different values than the corresponding neg_binomial_2_lpmf call. In all cases this is related to large phi values (noticeable discrepancies manifest already at around phi = 2e5).

This seems to be related to/a regression of #428

Example

neg_binomial_2_log_lpmf(39, 4, 6e16) == 256. I've created a failing test on the branch: https://github.com/stan-dev/math/compare/develop...martinmodrak:bugfix/1495-neg_binomial_2_log_large_phi

Expected Output

The same as neg_binomial_2_lpmf(39, exp(4), 6e16), which is -5.22991

Current Version:

v3.0.0

martinmodrak commented 4 years ago

1496 seems related as the phi cutoff for when to delegate to Poisson is probably not correctly chosen.

jtimonen commented 4 years ago

Is it possible to pin down the part of math/stan/math/prim/prob/neg_binomial_2_lpmf.hpp where the numerical problems occur with large phi? My guess is that it's in computing the binomial coefficient, which you seem to be doing by computing the logarithms of the factorials using lgamma. Can the problem be just numerical inaccuracy of lgamma with large input? Are you using Stirling's approximation or something to compute the logarithm of the gamma function?

jtimonen commented 4 years ago

Here is my idea:

nb_stirling
martinmodrak commented 4 years ago

@jtimonen You are correct, my work with the non-log version in #1497 shows that the Poisson branch can be eliminated with a better implementation of binomial_coefficient_log - this better implementation is almost ready in #1614 and so all this should bubble up eventually also here. Note that my experience with reimplementing lbeta is that Stirling approximation is not enough in and of itself and corrections are needed (see #1612) to maintain precision when phi is large and x is small.

Sorry that I wasted your time as this connection was not apparent in the issue here.

wds15 commented 4 years ago

If I read this right, then you are suggesting to improve the lgamma implementation... I do doubt that there are issues of these sort of. We are using the lgamma_r implementation as a default and fall back to boost::math::lgamma on Windows. I have stared a lot at the boost thing and I would be relatively surprised if there is any issue with large arguments or whatever. This thing is coded with great care.

So before you embark on this, please test the lgamma if it is really not precise enough.

martinmodrak commented 4 years ago

@wds15 I do not suggest to improve lgamma. The trick I mention concerns differences of lgamma calls for large arguments where precision might be lost to catastrophic cancellation (e.g. lbeta, binomial_coefficient_log). See #1612 for more details.

wds15 commented 4 years ago

I see. Being smart about sums of lgamma sounds like a good approach. Sorry.. I am out of context.

This is super tedious work you are looking into here. So much appreciated.

Some people quoted me for knowing these things... not really, I just have dealt with a few of these hiccups whenever I had to.

jtimonen commented 4 years ago

My previous post had an mistake already in the definition of NB pmf. I am just leaving a (hopefully) correct version here If anyone is wondering. I had to derive this anyway so it wasn't waste of time

Screenshot 2020-02-05 at 1 39 33