rust-random / rand

A Rust library for random number generation.
https://crates.io/crates/rand
Other
1.6k stars 423 forks source link

UniformFloat: allow inclusion of high in all cases #1462

Open dhardy opened 1 week ago

dhardy commented 1 week ago

Summary

Fix #1299 by removing logic specific to ensuring that we emulate a closed range by excluding high from the result.

Motivation

Fix #1299.

Additional motivation: floating-point types are approximations of continuous (real) numbers. Although one might expect that rng.gen_range(0f32..1f32) is strictly less than 1f32, it can also be seen as an approximation of the real range [0, 1) which includes values whose nearest approximation under f32 is 1f32.

In general, one can always expect that floating-point numbers may round up/down to the nearest representable value, hence this change is not expected to affect many users.

sample_single was changed (see below) as a simplification, and to match the new behaviour of new.

Details

Specific to floats (i.e. UniformFloat),

Note: new attempts to ensure that high may be yielded; sample_single_inclusive does not. This has not changed, but is a little surprising.

dhardy commented 1 week ago

I should add that there is still a loop in new and new_inclusive, but I don't think it can be a problem any more. This would require that low + (high - low) / (1 - ε) * (1 - ε) rounds to a value greater than high, but additionally that there are many representable values between scale = (high - low) / (1 - ε) and the next smallest value which will round down to high. Specifically, we know that high is representable.

@sicking @WarrenWeckesser

vks commented 1 week ago

Additional motivation: floating-point types are approximations of continuous (real) numbers. Although one might expect that rng.gen_range(0f32..1f32) is strictly less than 1f32, it can also be seen as an approximation of the real range [0, 1) which includes values whose nearest approximation under f32 is 1f32.

I don't like this argument, because when calculating probabilities with real numbers, there is no difference between [0, 1) and [0, 1] when integrating the PDF. Therefore, the distinction only makes sense for floats. Also, when using something advertised as [0, 1), I would want to be able to rely on it, because otherwise I have to take care of the corner cases with rejection sampling when e.g. taking logs. If we cannot guarantee it, we might as well deprecate [0, 1) and only offer [0, 1].

I agree we should fix #1462.

new sets scale = (high - low) / (1 - ε) then ensures that low + scale * max_rand <= high (unchanged)

I think you meant new_inclusive?

dhardy commented 1 week ago

I don't like this argument ...

With real numbers, both [0, 1] and [0, 1) have a range of 1, but the latter does not contain 1. With floats, it's pretty easy to lose precision — e.g. (1 - ε/2) + 1 → 2 — so does it make sense for us to take great care here? Will many users actually care that we do?

new and new_inclusive are part of trait UniformSampler and important to integers.

Meanwhile, we still provide Standard, OpenClosed01 and Open01 (but not Closed01). None of those are affected here.

I agree we should fix https://github.com/rust-random/rand/pull/1462.

I think you meant #1299?