JuliaStats / Distributions.jl

A Julia package for probability distributions and associated functions.
Other
1.12k stars 419 forks source link

Request for comments: Partially-sampled random numbers and other algorithms for arbitrary-precision sampling from distributions #1202

Open peteroupc opened 4 years ago

peteroupc commented 4 years ago

I want to inform you of a concept called partially-sampled random numbers (PSRNs), which represent random numbers of an arbitrary precision, but whose contents are sampled only as necessary. PSRNs are useful for implementing algorithms to sample from continuous distributions with arbitrary precision and without relying on floating-point arithmetic. I thus believe PSRNs may be useful to have in this library, or should at least be given some consideration.

A partially-sampled random number stores a sign, an integer part, and a fractional part with an arbitrary number of digits. It's a pure data structure; algorithms that operate on PSRNs are expected to provide (or access) an RNG separately from the PSRN data structure.

I have written a page describing PSRNs in detail, as well as algorithms to sample PSRNs and certain continuous distributions, such as the beta distribution and exponential distribution.

As you can see, supporting arithmetic and comparisons with PSRNs is anything but trivial. Neither is it trivial to describe an arbitrary-precision sampler for popular distributions, but I have done so for the beta, Rayleigh, and logistic distributions, and the page also has algorithms for operations PSRNs -- and describing the latter algorithms was far from easy to do. Moreover, I am not a Julia programmer (so I can't provide a Julia implementation of these algorithms for now), but I want to make the PSRN concept better known, including here where the focus is on non-uniform random number generation. This is one of the reasons I have written the PSRN article (as well as other articles on random sampling, including Bernoulli factory algorithms, the uniform sum/ratio distribution, and other arbitrary-precision samplers); another is to encourage others to develop new algorithms and new implementations for arbitrary-precision "exact" sampling (as Karney, for example, has done with the normal distribution in 2014-2016).


To go into a little more detail:

A PSRN is a pure data structure: it contains only an integer part, a fractional part, and a sign. Algorithms that operate on PSRNs are expected to provide (or access) an RNG separately from the PSRN data structure.

For example, say the PSRN stores base-10 digits. It could thus be expressed as follows:

  1. A 32-bit integer part.
  2. A vector of digits making up the fractional part. For instance, the vector could store {1, 5, 7} and thus represent a random number in the interval [0.157, 0.158].
  3. A sign (a single bit that means either negative or non-negative). It may or may not be stored separately from the integer part.

Note the absence of an RNG here. As an example, say the PSRN stores a non-negative sign, an integer part of 98, and a fractional part {1, 5, 7}. It thus represents a random number in the interval [98.157, 98.158].

peteroupc commented 3 years ago

In the meantime I have updated the partially-sampled random number (PSRN) article and numerous other open-source articles on random variate generation, including a page with more algorithms that operate on PSRNs. In these articles I put an emphasis on random variate generation:

As a reminder, a PSRN consists of an integer part, a sign, and a fractional part containing an arbitrary number of digits.

peteroupc commented 1 year ago

I have updated my open-source variate generation articles.

See peteroupc/peteroupc.github.io#18.

They describe numerous algorithms on generating random variates (both from common and general distributions) with arbitrary precision and sampling Bernoulli distributions exactly. The audience is computer programmers with mathematics knowledge, but little or no familiarity with calculus, and this Julia project is a good vehicle for programmers to try their hand at writing code for those algorithms.

The articles follow, and here I would like review on how easy they are to implement in Julia, among other things.