lbeta is unstable when the arguments differ a lot

stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.

https://mc-stan.org

BSD 3-Clause "New" or "Revised" License

745 stars 187 forks source link

lbeta is unstable when the arguments differ a lot #1611

Closed martinmodrak closed 4 years ago

martinmodrak commented 4 years ago

Description

When the arguments are on wildly different scales, lbeta suffers from catastrophic cancellation.

Example + Expected output

https://www.wolframalpha.com/input/?i=LogGamma%2810%5E20%29+%2B+LogGamma%283%29+-+LogGamma%2810%5E20+%2B+3%29

lbeta(1e20, 3)
Stan:         0
Mathamatica: -137.46195839908279

https://www.wolframalpha.com/input/?i=LogGamma%283*10%5E15%29+%2B+LogGamma%2812895%29+-+LogGamma%283*10%5E15+%2B+12895%29

lbeta(3e15, 12895)
Stan:        -350384
Mathamatica: -350396.98895556212

Current Version:

v3.0.0

bob-carpenter commented 4 years ago

Nice catch.

Is there a known fix here? We're currently using this:

lgamma(a) + lgamma(b) - lgamma(a + b);

martinmodrak commented 4 years ago

Fixed along #1592 in #1497 (if it is preferable, I might split this fix from the #1497 PR into a separate one , but so far it is IMHO all related as failures in neg_binomial_2_lpmf motivated the improvements for binomial_coefficient_log and lbeta).

It is a neat trick I found in the R codebase - you first compute the lbeta via Stirling approximation to lgamma which allows you to analytically manipulate the expression and avoid the cancellation for the "bulk" of the value. Then you compute the difference between lgamma and the Stirling approximation and add this to the result to restore precision.

The code in R for the difference between lgamma and Stirling was opaque and used some tricks I didn't really get, but it turns out just using 3 additional terms of the Stirling series as found on Wikipedia is enough for all the tests I could come up with :-)

mcol commented 4 years ago

Assuming it doesn't take an inordinate amout of effort, having it separate makes it easier to review, and will help if in the future this needs to be revisited for whatever reason.

bob-carpenter commented 4 years ago

Thanks for the explanation. This numerics stuff is tricky.

On Jan 13, 2020, at 2:15 PM, Martin Modrák notifications@github.com wrote:

Fixed along #1592 in #1497 (if it is preferable, I might split this fix from the #1497 PR into a separate one , but so far it is IMHO all related as failures in neg_binomial_2_lpmf motivated the improvements for binomial_coefficient_log and lbeta).

It is a neat trick I found in the R codebase - you first compute the lbeta via Stirling approximation to lgamma which allows you to analytically manipulate the expression and avoid the cancellation for the "bulk" of the value. Then you compute the difference between lgamma and the Stirling approximation and add this to the result to restore precision.

The code in R for the difference between lgamma and Stirling was opaque and used some tricks I didn't really get, but it turns out just using 3 additional terms of the Stirling series as found on Wikipedia is enough for all the tests I could come up with :-)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

martinmodrak commented 4 years ago

So, I separated this into PR #1612, should be ready for review.