rust-random / rand

A Rust library for random number generation.
https://crates.io/crates/rand
Other
1.67k stars 431 forks source link

rng.gen_range(0..v.len()) is not portable across usize sizes #1415

Closed banool closed 4 months ago

banool commented 8 months ago

I just realized that unfortunately ChaCha8Rng is not portable, it returns different values for wasm32-unknown-unknown than aarch64-apple-darwin for example. Alas I've already used it in a project where I can't undo it. Could we add something to the CSPRNGs section of the docs that states which of these generators (if any) are portable, similar to how that is mentioned in the previous table? https://rust-random.github.io/book/guide-rngs.html#cryptographically-secure-pseudo-random-number-generators-csprngs.

banool commented 8 months ago

Alternatively I realize that perhaps it's actually the fact that I'm generating floats which is introducing the inconsistency. In any case, clarifying the docs would help folks in a situation like this narrow down the problem.

dhardy commented 8 months ago

The RNG (output of RngCore methods) is not portable? Then that is a bug; please report.

dhardy commented 8 months ago

Floating-point numbers are unfortunately not completely portable. If I remember correctly, SSE2 helps with portability on x86 CPUs. I don't know about WASM and AArch64. In fact I don't even know if WASM is precise about portability of float results; looks like it just refers to IEEE 754 without even specifying that standard's version.

banool commented 8 months ago

I'm using code like this:

let rng = ChaCha8Rng::seed_from_u64(seed);
let sky_colors = [
    Color::rgb_u8(255, 202, 140),
    Color::rgb_u8(211, 255, 154),
    Color::rgb_u8(71, 224, 226),
];
let sky_color = sky_colors[rng.gen_range(0..sky_colors.len())];

On aarch64-apple-darwin:

Seed 2093633454911051196
Sky color: Rgba { red: 0.2784314, green: 0.8784314, blue: 0.8862745, alpha: 1.0 }

On wasm32-unknown-unknown:

Seed 2093633454911051196
Sky color: Rgba { red: 0.827451, green: 1.0, blue: 0.6039216, alpha: 1.0 }

For the curious, the full code is here: https://github.com/banool/aptos-summits/blob/07568cabbcdcf87ab8f0e3709cdfa08377639fc5/art/artcore/src/lib.rs#L111.

For a non portable RNG I would expect this, but I'm not sure if ChaCha8 (I'm using version 0.3.1) is portable not, which is why I opened this docs issue. Weird thing is, I'm not using any floats here, but it could also be the usize problem it warns about here: https://rust-random.github.io/book/portability.html. I suppose it could also be related to using a u64 in a JS environment (which is where I'm running my wasm).

Update: I tried RNGs that are marked as portable in the docs and I still get this problem, so it is probably something else, e.g. the u64 for the seed or the fact that I have to use usizes for gen_range.

dhardy commented 8 months ago

or the fact that I have to use usizes for gen_range

This is the most likely culprit. In fact, the only random part of your code is rng.gen_range(0..3usize), which reduces to <usize as SampleUniform>::sample_single_inclusive(0, 2, rng).

Given that we already have value-breaking changes here and that this is an obvious foot-gun, I'm tempted to change this to make it portable in rand v0.9 (see #1399).

banool commented 8 months ago

Yeah definitely feels like a footgun, I'd love the API to just not let me make this mistake. At the least a big warning in the docs for that method / any method with usize involved would be good.

vks commented 8 months ago

@banool Can you make it portable by avoiding to generate usize? You can just cast the length for small collections. Or better, use an utility function that checks the number fits into a u32.

@dhardy We have a workaround for this when shuffling (generating a u32 for collections with less than 2**32 elements even on 64-bit platforms). Maybe we can generalize/expose it?

dhardy commented 8 months ago

@vks we could expose that, but impls of usize are in general a portability hazard.

dhardy commented 8 months ago

@banool can you confirm this solves your issue? If so, close this issue (we already have another for portability of usize output, linked above).

banool commented 8 months ago

Well I suppose the original issue still stands, that the docs don't make it clear really which CSPRNGs are portable.

As for my specific problem, let me try that and see if it helps. I thought perhaps the problem was also that the input uses usizes as well (by virtue of gen range taking in a range that uses usize). But I'll find out!

newpavlov commented 8 months ago

All CSPRNs should be portable per se, i.e. calling the RngCore methods should produce the same result on all targets. You've encountered a portability hazard in a higher-level code which builds on top of the RngCore trait.

banool commented 8 months ago

Neato great, I can make a PR to amend the docs.

dhardy commented 4 months ago

Are you still intending to open that PR @banool?

banool commented 4 months ago

Sorry I got very busy at work, it's somewhere on my long to do list but no immediate plans.

dhardy commented 4 months ago

Closing as a duplicate of #1399 since this has nothing to do with the CSPRNG.