boostorg / math

Boost.org math module
http://boost.org/libs/math
Boost Software License 1.0
306 stars 220 forks source link

New special functions: logarithm of incomplete gamma and beta functions #1173

Open dschmitz89 opened 1 month ago

dschmitz89 commented 1 month ago

The incomplete gamma and beta functions arise in a LOT of cumulative density functions. Very accurate implementations of them are quite common and also in Boost. What I have not seen yet are implementations of their logarithms. These would be very helpful to compute accurate logarithms of CDF and SFs. One example is an issue in SciPy where a user wants to compute the logcdf of the Poisson distribution for not even very extreme values: https://github.com/scipy/scipy/issues/8424. Here the logarithm of the incomplete regularized gamma function would come very much in handy.

SciPy includes the logarithm of the normal distribution CDF already but incomplete beta and gamma functions are of course a larger beast to kill.

Is this something you would find worthwile for Boost?

Edit: CC ing the special function wizards @steppi @WarrenWeckesser

mborland commented 1 month ago

The incomplete gamma and beta functions arise in a LOT of cumulative density functions. Very accurate implementations of them are quite common and also in Boost. What I have not seen yet are implementations of their logarithms. These would be very helpful to compute accurate logarithms of CDF and SFs. One example is an issue in SciPy where a user wants to compute the logcdf of the Poisson distribution for not even very extreme values: scipy/scipy#8424. Here the logarithm of the incomplete regularized gamma function would come very much in handy.

SciPy includes the logarithm of the normal distribution CDF already but incomplete beta and gamma functions are of course a larger beast to kill.

Is this something you would find worthwile for Boost?

We certainly wouldn't turn them down. A while back @mdhaber asked about adding the logpdf and logcdfto the distribution so many were done (e.g. https://github.com/boostorg/math/blob/develop/include/boost/math/distributions/laplace.hpp#L235), but I think your proposal would help us both out. Right now the poisson dist simply uses: log(cdf(...))).

We have logsumexp here: https://github.com/boostorg/math/blob/develop/include/boost/math/special_functions/logsumexp.hpp, and continued fractions here: https://github.com/boostorg/math/blob/2a4351eb80a18c9abb5a40b96917df073f0939ae/include/boost/math/tools/simple_continued_fraction.hpp and here: https://github.com/boostorg/math/blob/2a4351eb80a18c9abb5a40b96917df073f0939ae/include/boost/math/tools/centered_continued_fraction.hpp depending on the route of implementation you're looking at.

Let us know what kind of assistance you need.

jzmaddock commented 1 month ago

Internally, the incomplete gamma and beta's are mostly computed as the product of an asymptotic part and a power term (I say mostly, because there are many code paths depending on the arguments). So in principle, calculating the log is relatively straightforward, it's just untangling all those code paths.

I might be tempted to do what I did for 1F1, and pass a scaling term down through all the calculations - the scaling term is an integer (so no calculation error at least in principle), and we then scale the final result by e^N, or of course, take logs if we're going that way.

BTW we already have the logpdf for this one correctly handled.

I also wonder if your original issue was motivated by the desire to get the complement of the CDF? ie he's asking for an accurate logcdf when the cdf is near 1. Why? To pass into expm1 to get the complement maybe? But I digress!

dschmitz89 commented 1 month ago

Thanks for your feedback, sounds like you agree that these would be useful. I am personally not qualified to help out as I am not a C++ or special function expert though. Let's see if we can find a champion for this one :).

About the Poisson logpdf: do you list log methods in the distribution docs? For Poisson, I cannot find it and the general distribution page does not list it either. In general if log methods are implemented, even generically without high precision, I would expect to find that info somewhere.

jzmaddock commented 1 month ago

I think @mborland added those, did the docs get updated at the same time?

mborland commented 1 month ago

I think @mborland added those, did the docs get updated at the same time?

Apparently not so I'll get them merged in the next few days. The short of it is all distributions have logpdf and logcdf minimally through generic functions (e.g. log(cdf(...))) and some have one or both specialized.

mborland commented 1 month ago

Thanks for your feedback, sounds like you agree that these would be useful. I am personally not qualified to help out as I am not a C++ or special function expert though. Let's see if we can find a champion for this one :).

About the Poisson logpdf: do you list log methods in the distribution docs? For Poisson, I cannot find it and the general distribution page does not list it either. In general if log methods are implemented, even generically without high precision, I would expect to find that info somewhere.

See: https://github.com/boostorg/math/pull/1176