Implement an exact Bernoulli distribution

Popog commented 2 years ago

Background

The current implementation has a probability limit of 1 / (2^32-1) for the integer constructor and a limit of 1 / (2^64) for the f64 constructor. Assuming a couple nanoseconds per flip, the former's precision limitation could become visible even with only a single core in a matter of seconds. The latter is certainly more robust, taking at least a thousand or so processor-years, but even then distributed computing environments could conceivably reach this precision limitation within a few months.

Feature request

Implement an exact Bernoulli distribution. I have created an implementation which I'd be glad to send out a PR to integrate. Mostly looking for feedback if that is desirable before doing so.

dhardy commented 2 years ago

Thanks for the suggestion and offer. This comes back to #494 (a little outdated; my latest thinking is to create a new crate for high-precision implementations).

Also see #531, #411, #491.

peteroupc commented 1 year ago

There are different approaches to use with irrational Bernoulli probabilities than with rational Bernoulli probabilities. Rational probabilities can be handled by generating an integer with the range of the denominator and then comparing. Your repository could work as well, but I haven't reviewed it enough to say whether it supports arbitrary precision. Irrational probabilities require specialized algorithms, and I discuss many of them in "Bernoulli Factory Algorithms".

dhardy commented 1 year ago

To sample with perfect precision, yes. But if some level of bias is considered acceptable (e.g. no more than 1 in 2^48 samples affected, as used in #1287), then some simplification should be possible? (For this, is just using f64 enough?)

peteroupc commented 1 year ago

To sample with perfect precision, yes. But if some level of bias is considered acceptable (e.g. no more than 1 in 2^48 samples affected, as used in #1287), then some simplification should be possible? (For this, is just using f64 enough?)

For "biased" Bernoulli sampling, better to find a rational approximation of the Bernoulli probability in question, and for an approximation error of $2^{48}$ a binary64 floating-point number can store the approximation (namely $N/2^{48}$ where $N$ is an integer with $0\le N\le 2^{48}$) without loss, given that it stores a 53-bit binary significand. (I prefer storing the approximation in an integer if possible, though.)

By the way:

The following work discusses approximate sampling for more general discrete distributions: Saad et al., "Optimal Approximate Sampling From Discrete Probability Distributions", arXiv:2001.04555 [cs.DS], also in Proc. ACM Program. Lang. 4, POPL, Article 36 (January 2020).
If you're interested I could mention exact and approximate algorithms for the Bernoulli probability $f(p)$ where $p$ is an unknown Bernoulli probability.

rust-random / rand

Implement an exact Bernoulli distribution #1193

Background

Feature request