stdlib-js / stdlib

✨ Standard library for JavaScript and Node.js. ✨
https://stdlib.io
Apache License 2.0
4.35k stars 440 forks source link

Logarithm of the Cumulative Distribution Function for a Student's t distribution. #2528

Open soegaard opened 3 months ago

soegaard commented 3 months ago
  1. The formula for F(x;nu) in the documentation is only valid for x>0.
image

Wikipedia uses t (not x), but gives the same formula, but for t>0 only:

image

From "Sampling Student’s T distribution – use of the inverse cumulative distribution function" by William T. Shaw:

image

And the same formula is returned by wolframscript: image

  1. There is a problem in the implementation of logcdf. The implementation basically computes the normal cdf-probability and then takes the logarithm. This defeats the purpose of having a logcdf, which is meant to be used when cdf(x) is so small it is rounded to 0.

    The formula for cdf uses the incomplete beta function, which in turn uses beta. Your project includes an betaln which computes the logarithm of a beta value. A solution must be to use betaln to implement an "logarithm of a regularized, incomplete beta function" and use that to implement logcdf.

stdlib-bot commented 3 months ago

:wave: Hi there! :wave:

And thank you for opening your first issue! We will get back to you shortly. :runner: :dash:

kgryte commented 3 months ago

@Planeshifter Would you mind taking a look?

Planeshifter commented 3 months ago

Hi @soegaard,

Thanks for opening this issue!

You are correct that the displayed formula is only valid for x > 0, which is missing from the respective README.mds. It's probably best to change them to instead display the formula you referenced. I will take care of this shortly and post an update once it is done.

Your second point with regards to the logcdf is fair as well; it would be good to have a dedicated implementation that avoids rounding errors. However, this may require some R&D as betainc's underlying implementation is rather complex (see kernel-betainc.)

soegaard commented 3 months ago

Just thinking out loud:

In Mathematica the regularized incomplete Beta function is called BetaRegularized.

https://reference.wolfram.com/language/ref/BetaRegularized.html

We need the logarithm, so we can use a combination of FunctionExpand and PowerExpand to get a formula (using the free wolframscript):

In[7]:= PowerExpand[FunctionExpand[Log[BetaRegularized[z,a,b]]]]                                                                                                              
Out[7]= Log[Beta[z, a, b]] - Log[Gamma[a]] - Log[Gamma[b]] + Log[Gamma[a + b]]

At first sight, I thougt this meant that betaln could be used. However, Beta[z, a, b] is the incomplete beta function, so to compute Log[Beta[z, a, b]] we need (like you concluded) an betaincln in stdlib-js.

That is indeed tricky to implement.

The reason I noticed logcdf in the first place is that I am implementing logcdf for the Racket math library and wanted to see how other libraries did it.

FWIW the implementation of "betaincln" in the Racket library is here (the flag log? switches between betaincln and betainc):

https://github.com/racket/math/blob/master/math-lib/math/private/functions/incomplete-beta.rkt#L192

I do not know the details of the algorithm used. But the strategy used it to give all functions involved in computing "betainc" a flag log?. Then (very oversimplified) the flag is used to determine whether to compute a sum or a product.

kgryte commented 1 month ago

Related discussion happening on Boost: https://github.com/boostorg/math/issues/1173