Absolucy / nanorand-rs

A tiny, fast, zero-dep library for random number generation
zlib License
223 stars 23 forks source link

Generated floats are not uniformly distributed #30

Open knutwalker opened 3 years ago

knutwalker commented 3 years ago

Hej 👋

The documentation does not say anything about the distribution of the values, though one would assume that primitive values ought to be uniformly distributed. This is not the case for f32 and f64 based on https://github.com/Absolucy/nanorand-rs/blob/5e4ea401102cb3e1a519fed84778bc110d792e56/nanorand/src/gen.rs#L97 and https://github.com/Absolucy/nanorand-rs/blob/5e4ea401102cb3e1a519fed84778bc110d792e56/nanorand/src/gen.rs#L103

Floats above a certain value lose prevision, so using, say, u32::MAX and u32::MAX - 42 as the numerator both yield 1.0, which leads to a bias of the generated floats towards 1.0. This playground illustrates it a bit more: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=465eb2dcf33851195dde9f3a6601ea22

If you want to fix that, a solution would be to use the "safe" value provided by <float>::MANTISSA_DIGITS as the max instead of <int>::MAX. I could ope a PR for that if you like.

Absolucy commented 3 years ago

Sure, feel free to open a PR!

Absolucy commented 2 years ago

I'm not too familar how to do this with <float>::MANTISSA_DIGITS. f64::MANTISSA_DIGITS is just... 53. Do you mean something like u64::MAX >> f64::MANTISSA_DIGITS?

jklott commented 1 year ago

I've been looking at this for a bit and did some testing on the matter. You can use this playground to dive into how I was testing: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e46aa8e1706df69ff68fe5805c3779dc

The results are basically that for values between 1 and 127 (inclusive), the current method won't tell the difference between something like (u32::MAX - 0) as f32 / u32::MAX as f32 and (u32::MAX - 127) as f32 / u32::MAX as f32

The change that @knutwalker has talked about shows that it won't tell the difference only for values 1 and 2. That is, (U32_SAFE_MAX - 0) as f32 / U32_SAFE_MAX as f32 and (U32_SAFE_MAX - 2) as f32 / U32_SAFE_MAX as f32 evaluate the same.

So although I'm not sure exactly what change should be implemented to random, or if one is needed at all, but it appears to only happen in the case u32::MAX >= u32::random >= u32::MAX - 127

jklott commented 1 year ago

So although it does seem for 128 different numbers at the tail end of the distribution it blends together, from my testing it seems to have no noticeable effect on the distribution. I used the code below to test multiple times how the distribution ends up and I feel it is sufficient for a non-cryptographically secure random generator. If you have a PR to submit @knutwalker then I think that wouldn't harm anything, but just leaving this comment to say the issue isn't noticeable.

use std::vec;

use nanorand::WyRand;

use nanorand::Rng;

fn main() {
    let mut rng = WyRand::new();
    //println!("Random number: {}", rng.generate::<f64>());

    let mut distributions = vec![0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

    for _ in 0..=99_999_999 {
        let num = rng.generate::<f32>();
        if num <= 0.1 {
            distributions[0] += 1;
        } else if num <= 0.2 {
            distributions[1] += 1;
        } else if num <= 0.3 {
            distributions[2] += 1;
        } else if num <= 0.4 {
            distributions[3] += 1;
        } else if num <= 0.5 {
            distributions[4] += 1;
        } else if num <= 0.6 {
            distributions[5] += 1;
        } else if num <= 0.7 {
            distributions[6] += 1;
        } else if num <= 0.8 {
            distributions[7] += 1;
        } else if num <= 0.9 {
            distributions[8] += 1;
        } else {
            distributions[9] += 1;
        }
    }

    let mut new_num: f64;
    let mut normal_dis: Vec<f64> = vec![];

    for element in distributions {
        new_num = (element as f64) / 99_999_999.0;
        normal_dis.push(new_num);
    }

    println!("{:?}", normal_dis);
}
[0.09994116099941161, 0.10000555100005551, 0.100042101000421, 0.09999625099996251, 0.0999446709994467, 0.10003830100038301, 0.10000667100006672, 0.0999799509997995, 0.1000539110005391, 0.09999144099991442]