classify functions for numerical instability

alashworth commented 5 years ago

Issue by bob-carpenter Thursday Jul 24, 2014 at 19:22 GMT Originally opened as https://github.com/stan-dev/stan/issues/801

Flag naive implementations (like incomplete beta derivatives) in the manual.

Ideally, characterize the regions of each function's domain over which it is numerically stable.

Make sure that iterative functions have finite bounds and throw exceptions if they don't converge.
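A minimal sketch of what a bounded iterative function could look like, assuming a power-series evaluation of the regularized lower incomplete gamma function. The name `reg_inc_gamma_series`, the `max_iter` and `tol` parameters, and the choice of `std::domain_error` are illustrative assumptions, not Stan's actual API or error policy.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical helper (not part of Stan): regularized lower incomplete gamma
// P(a, x) = x^a e^{-x} / Gamma(a + 1) * sum_{n >= 0} x^n / ((a+1)...(a+n)).
// The loop is capped at max_iter terms and throws if it has not converged.
double reg_inc_gamma_series(double a, double x, int max_iter = 1000,
                            double tol = 1e-15) {
  if (!(a > 0.0) || !(x >= 0.0) || !std::isfinite(a) || !std::isfinite(x))
    throw std::domain_error("reg_inc_gamma_series: need finite a > 0, x >= 0");
  if (x == 0.0) return 0.0;

  double term = 1.0;  // n = 0 term of the series
  double sum = term;
  for (int n = 1; n <= max_iter; ++n) {
    term *= x / (a + n);  // ratio of consecutive terms
    sum += term;
    if (std::fabs(term) <= tol * std::fabs(sum)) {
      // Prefactor x^a e^{-x} / Gamma(a + 1), assembled on the log scale.
      double log_pref = a * std::log(x) - x - std::lgamma(a + 1.0);
      return std::exp(log_pref) * sum;
    }
  }
  // Finite bound reached: report failure instead of returning a bad value.
  throw std::domain_error("reg_inc_gamma_series: series did not converge");
}
```

With `a = 1` the result should match `1 - exp(-x)`, which gives a cheap sanity check; a call such as `reg_inc_gamma_series(1.0, 50.0, 5)` hits the iteration cap and throws.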

alashworth commented 5 years ago

Comment by betanalpha Thursday Jul 24, 2014 at 20:33 GMT

Basically anything that uses my fragile implementations of the hypergeometric functions, 3F2 and 2F1:

[ ] src/stan/prob/distributions/univariate/continuous/beta.hpp
[ ] src/stan/prob/distributions/univariate/continuous/student_t.hpp
[ ] src/stan/prob/distributions/univariate/discrete/beta_binomial.hpp
[ ] src/stan/prob/distributions/univariate/discrete/neg_binomial.hpp

Some effort should go into testing my implementation of the gradients of the incomplete gamma function, too:

[ ] src/stan/prob/distributions/univariate/continuous/inv_chi_square.hpp
[ ] src/stan/prob/distributions/univariate/continuous/scaled_inv_chi_square.hpp
[ ] src/stan/prob/distributions/univariate/continuous/inv_gamma.hpp

alashworth commented 5 years ago

Comment by bob-carpenter Thursday Jul 24, 2014 at 21:51 GMT

I'd add

[ ] all the complementary cumulative distribution functions (ccdf) which are now implemented as 1-cdf, for values where cdf is near 1
[ ] operator- when numbers are too close to each other (the above problem is due to this and the naive implementation of ccdfs as 1 - cdf); similarly for operator+ when one input is negative and one positive and they're close in absolute value
[ ] exp when inputs are too large (above roughly 709.78 in double precision) or too small (below roughly -745.13), which leads to overflow to +infinity and underflow to 0, respectively
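To make the cancellation and overflow problems in this list concrete, here is a small sketch using the standard normal distribution. The function names are hypothetical; only `std::erf`, `std::erfc`, `std::exp`, and `std::log1p` from `<cmath>` are assumed.

```cpp
#include <algorithm>
#include <cmath>

// Naive ccdf as 1 - cdf: once the cdf rounds to 1, the result is exactly 0.
double std_normal_ccdf_naive(double x) {
  double cdf = 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
  return 1.0 - cdf;
}

// Direct ccdf: erfc evaluates the upper tail without subtracting from 1.
double std_normal_ccdf_direct(double x) {
  return 0.5 * std::erfc(x / std::sqrt(2.0));
}

// log(exp(a) + exp(b)) without overflowing exp: factor out the larger input.
// NaN inputs are not handled in this sketch.
double log_sum_exp(double a, double b) {
  double m = std::max(a, b);
  if (std::isinf(m)) return m;  // both are -inf, or at least one is +inf
  return m + std::log1p(std::exp(std::min(a, b) - m));
}
```

At `x = 10` the naive ccdf returns 0 while the direct version returns a value near 7.6e-24. Similarly, `log(exp(1000) + exp(1000))` overflows to infinity, whereas `log_sum_exp(1000, 1000)` returns `1000 + log(2)`.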

It'd be nice if we could make some of the functions better behaved on the log scale, because what we really need is log_cdf. The problem is that the derivatives involve 1/cdf, so unless things cancel, we're stuck having no more precision than for cdf itself in terms of gradients.
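As an illustration of what a log-scale cdf buys, here is a sketch for the logistic distribution, chosen only because its cdf has a simple closed form. The function names are hypothetical and this is not how Stan implements any particular distribution.

```cpp
#include <cmath>

// Naive: form the cdf 1 / (1 + exp(-x)) first, then take its log.
// For very negative x the cdf underflows to 0 and the log becomes -inf;
// for large positive x the cdf rounds to 1 and the log becomes 0.
double logistic_lcdf_naive(double x) {
  return std::log(1.0 / (1.0 + std::exp(-x)));
}

// Log-scale version: log(cdf) = -log1p(exp(-x)).
// For x < 0 use the equivalent form x - log1p(exp(x)) so exp never overflows.
double logistic_lcdf(double x) {
  if (x < 0.0) return x - std::log1p(std::exp(x));
  return -std::log1p(std::exp(-x));
}
```

At `x = -800` the naive version returns negative infinity, while the log-scale version returns -800. At `x = 40` the naive version returns 0, while the log-scale version keeps the small negative value.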

[ ] the basic cov_matrix and corr_matrix data types when there are too many dimensions (maybe this is going to be better now that we're not checking symmetry but just using the lower triangular portions --- that did go in, didn't it?)

alashworth commented 5 years ago

Comment by aadler Thursday Jul 24, 2014 at 21:55 GMT

Michael, just in case you haven't seen these, they may prove helpful:

alashworth commented 5 years ago

Comment by betanalpha Friday Jul 25, 2014 at 08:10 GMT

> It'd be nice if we could make some of the functions better behaved on the log scale, because what we really need is log_cdf. The problem is that the derivatives involve 1/cdf, so unless things cancel, we're stuck having no more precision than for cdf itself in terms of gradients.

Technically the gradients require grad(cdf) / cdf, which is another one of those functions that really should be computed directly (a la the polygamma functions) if we wanted to be really robust.
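A sketch of the difference between dividing two separately computed quantities and computing the ratio directly, using the logistic distribution as an assumed example where the ratio happens to cancel in closed form (pdf / cdf = 1 - cdf). The function names are hypothetical; distributions without such a cancellation would need series or other approximations instead.

```cpp
#include <cmath>

// Naive ratio: compute pdf and cdf separately, then divide.
// For very negative x, exp(-x) overflows and the result is NaN.
double logistic_dlcdf_naive(double x) {
  double e = std::exp(-x);
  double cdf = 1.0 / (1.0 + e);
  double pdf = e / ((1.0 + e) * (1.0 + e));
  return pdf / cdf;
}

// Direct ratio: d/dx log(cdf) = pdf / cdf = 1 - cdf = 1 / (1 + exp(x)).
double logistic_dlcdf_direct(double x) {
  return 1.0 / (1.0 + std::exp(x));
}
```

At `x = -800` the naive ratio is NaN, while the direct form returns 1, which is the correct limit of the gradient of the log cdf in the lower tail.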

The strategy would involve someone familiar with

a) Implementing recursion relations (most of these functions have nice analytic continuations)
b) Computing asymptotic expansions or Taylor series (asymptotic expansions seem to be more efficient)
c) Given a series for f(x) or 1 + f(x), finding a good series approximation to log(f(x)) or log(1 + f(x))
d) Pade approximants and rational functions for developing efficient approximations to ratios of functions
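Items b) and c) can be sketched together for one concrete case: the log of the standard normal upper tail, using the well-known asymptotic expansion Q(x) ~ phi(x) / x * (1 - 1/x^2 + 3/x^4 - 15/x^6) for large positive x. The function names and the truncation after three correction terms are illustrative assumptions; the expansion is only useful for fairly large x (roughly x > 5).

```cpp
#include <cmath>

// Hypothetical helper: log Q(x) for large positive x, where Q = 1 - Phi.
// Writes Q(x) = phi(x) / x * (1 + f(x)) and applies log1p to the series f.
double std_normal_lccdf_asymptotic(double x) {
  const double log_sqrt_2pi = 0.9189385332046727;  // 0.5 * log(2 * pi)
  double r = 1.0 / (x * x);
  // Truncated correction f = -r + 3 r^2 - 15 r^3, evaluated in Horner form.
  double f = r * (-1.0 + r * (3.0 - 15.0 * r));
  return -0.5 * x * x - std::log(x) - log_sqrt_2pi + std::log1p(f);
}

// Naive reference: log of the tail probability computed on the linear scale.
// Returns -inf once erfc underflows to 0.
double std_normal_lccdf_naive(double x) {
  return std::log(0.5 * std::erfc(x / std::sqrt(2.0)));
}
```

At `x = 10` the two versions agree to about six digits. At `x = 40` the naive version returns negative infinity because the tail probability underflows, while the expansion still returns a finite value near -804.6.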
