vks closed this issue 1 month ago
So it uses a maximum of two steps, a bit like Canon's method. Might be generally preferable to #531, but probably still has a significant cost overhead?
At any rate, it may be worth investigating (implementing and benchmarking at least), but not something I'm going to put on my to-do list.
Initial benchmarks for `f64` on the `OpenClosed01` distribution (test `distr_openclosed01_f64`):

- 1,089 ns/iter (+/- 7) = 7346 MB/s
- 1,528 ns/iter (+/- 17) = 5235 MB/s
- 1,312 ns/iter (+/- 15) = 6097 MB/s
- `u64` to `f64`: 991 ns/iter (+/- 10) = 8072 MB/s
EDIT: testing was done on a Zen 3 x86_64 processor, but I didn't pass `-C target-cpu=native`, so `rep bsf` was being used instead of `tzcnt`. Rerunning with `-C target-cpu=native` seemed to make all the microbenchmarks slower, even the existing implementation, which is odd:

- 1,217 ns/iter (+/- 25) = 6573 MB/s
- 1,617 ns/iter (+/- 100) = 4947 MB/s
- 1,458 ns/iter (+/- 20) = 5486 MB/s
- `u64` to `f64`: 963 ns/iter (+/- 10) = 8307 MB/s
Thanks. The overhead is not negligible, but it is small enough that this could be offered as an alternative to the current implementation behind a feature flag, if there is genuine interest in using it.
Background
Motivation: It's possible to get higher-quality floats without having to add a loop.
Application: I don't have a concrete application, but this approach can generate floats below 2^-53, and it does not generate 0 (which should have a probability of 2^-1075). It can also generate more distinct floats than our current approach.
Feature request
Implement another (0, 1] distribution.
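To illustrate the two-step idea (this is not the implementation benchmarked above, and the function name and bookkeeping are hypothetical): the leftover low bits of the first draw can extend the exponent geometrically via a trailing-zero count, with at most one extra draw when all leftover bits are zero, in the spirit of Canon's method:

```rust
/// Hypothetical sketch of a higher-precision (0, 1] sampler: the top 53
/// bits form the mantissa, and the trailing zeros of the 11 leftover bits
/// extend the exponent geometrically. Only when all 11 low bits are zero
/// is a second draw needed, so at most two steps are used.
fn high_precision_open_closed01(next: &mut impl FnMut() -> u64) -> f64 {
    let x = next();
    let mantissa = (x >> 11) as f64; // 53 random bits
    let low = x & 0x7FF; // 11 leftover bits
    let extra = if low != 0 {
        low.trailing_zeros()
    } else {
        // Second (and final) step: continue the geometric exponent draw.
        let y = next();
        11 + if y != 0 { y.trailing_zeros() } else { 64 }
    };
    // Result is in (0, 1]: the +1 excludes 0, and the largest mantissa
    // with extra == 0 yields exactly 1.0. The exponent never drops below
    // -128, so no subnormals are produced.
    (mantissa + 1.0) * 2f64.powi(-53 - extra as i32)
}

fn main() {
    let mut max_rng = || u64::MAX;
    assert_eq!(high_precision_open_closed01(&mut max_rng), 1.0);

    let mut zero_rng = || 0u64;
    let tiny = high_precision_open_closed01(&mut zero_rng);
    assert!(tiny > 0.0 && tiny < 2f64.powi(-53)); // below the usual granularity
}
```

This sketch shows the control flow only; the exact rounding and weighting needed for a strictly uniform distribution are omitted.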