Closed mdavisJr closed 1 month ago
I could be wrong, but I don't think it is an illusion: the test case above produces roughly the same results as those listed in the post whenever you run it.
Even if I raise the loop_count from 10 to 1000, when min = 1 and max = 40 the count is:
Direct Count: 883/1000
When min = 1 and max = 80 it gets a little better:
Direct Count: 633/1000
Assuming the numbers are perfectly random and sampled independently of each other, how many close numbers would you expect? (This is not entirely trivial to calculate, but the number exists.)
It is only a bug if you observe a different number with rand than what would theoretically be expected.
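For reference, the expected number can be worked out exactly by counting. The sketch below assumes that "within 3" means a difference of at most 3 (so duplicates count as close) and that the five values are drawn independently with replacement; the function names `binomial` and `p_close` are made up for this illustration.

```rust
/// Binomial coefficient C(n, k), computed in u128 (fine for the small inputs used here).
fn binomial(n: u64, k: u64) -> u128 {
    if k > n {
        return 0;
    }
    let k = k.min(n - k);
    let mut result: u128 = 1;
    for i in 0..k {
        // After this step `result` equals C(n, i + 1), so the division is exact.
        result = result * (n - i) as u128 / (i + 1) as u128;
    }
    result
}

/// Probability that at least two of `k` independent uniform draws from 1..=n
/// differ by at most `d`. Assumption: duplicates count as "close".
fn p_close(n: u64, k: u64, d: u64) -> f64 {
    // Sets of k values whose sorted neighbours are all more than d apart
    // can be counted as C(n - (k - 1) * d, k).
    let shrink = k.saturating_sub(1) * d;
    let spaced_sets = if n >= shrink { binomial(n - shrink, k) } else { 0 };
    // Each such set can be drawn in k! different orders.
    let orderings: u128 = (1..=k as u128).product();
    let total = (n as u128).pow(k as u32);
    1.0 - (spaced_sets * orderings) as f64 / total as f64
}
```

Under those assumptions, `p_close(40, 5, 3)` is about 0.885 and `p_close(80, 5, 3)` is about 0.618, which is in line with the 883/1000 and 633/1000 counts reported above.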
@mdavisJr Try similar code in any other language or library, or better, compute the expected value using statistical theory.
`ThreadRng` uses a CSPRNG seeded with system randomness, and you can see the sampling implementation here (IIRC it contains a statistically insignificant bias, but that should not matter in your case).
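One way to try this without rand at all is to repeat the experiment with an unrelated generator. The sketch below uses SplitMix64 purely as a stand-in; the range mapping, the helper names, and the reading of "within 3" as a difference of at most 3 are assumptions made for this illustration, not code from the issue.

```rust
/// SplitMix64: a small, well-known PRNG, used here only as a stand-in for `rand`.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }

    /// Integer in lo..=hi via multiply-shift; the bias is negligible for tiny ranges.
    fn gen_range(&mut self, lo: u64, hi: u64) -> u64 {
        let span = (hi - lo + 1) as u128;
        lo + ((self.next_u64() as u128 * span) >> 64) as u64
    }
}

/// True if any two values differ by at most `d` (sorts the slice in place).
fn has_close_pair(values: &mut [u64], d: u64) -> bool {
    values.sort_unstable();
    values.windows(2).any(|w| w[1] - w[0] <= d)
}

/// Fraction of `trials` sets of five draws from 1..=max that contain a close pair.
fn estimate(max: u64, trials: u32, seed: u64) -> f64 {
    let mut rng = SplitMix64(seed);
    let mut hits = 0u32;
    for _ in 0..trials {
        let mut set = [0u64; 5];
        for v in set.iter_mut() {
            *v = rng.gen_range(1, max);
        }
        if has_close_pair(&mut set, 3) {
            hits += 1;
        }
    }
    hits as f64 / trials as f64
}
```

With a large trial count, for example `estimate(40, 100_000, 1)`, the fraction should land near the 883/1000 reported for rand, which suggests the clumping is a property of uniform sampling rather than of a particular generator.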
Let's see if I understand your problem exactly...
@dhardy I think your math is wrong. You calculate for independent sample pairs, but the check is computed over 5 sampled values. It's the same mistake that gives rise to the birthday paradox.
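To illustrate the difference between one pair and a set of five, here is a small sketch that enumerates every possible pair. It assumes "close" means a difference of at most 3; `p_pair_close` and `pair_count` are hypothetical names.

```rust
/// Exact probability that two independent uniform draws from 1..=n
/// differ by at most `d`, found by enumerating all n * n ordered pairs.
fn p_pair_close(n: u32, d: u32) -> f64 {
    let mut close = 0u32;
    for a in 1..=n {
        for b in 1..=n {
            if a.abs_diff(b) <= d {
                close += 1;
            }
        }
    }
    close as f64 / (n * n) as f64
}

/// Number of distinct pairs among `k` values: k * (k - 1) / 2.
fn pair_count(k: u32) -> u32 {
    k * k.saturating_sub(1) / 2
}
```

For 1..=40, `p_pair_close(40, 3)` is 0.1675, so a single pair is rarely close. Five values contain `pair_count(5)` = 10 pairs, though, and those pairs share values and so are not independent, which is why the chance that at least one pair is close is so much higher.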
It would have helped if he had included the source for `fn check_within`. If it's just checking that any two numbers are separated by no more than 3, then the result makes sense. You need to use almost half the available range just to avoid any "close" results (1, 5, 9, 13, 17).
He sorted before ;)
Yes, that `sort` makes a huge difference: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=af19c837fa92b0f3c0e25efe30a49e8a
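Since the source of `check_within` was not posted, here is only a guess at its shape: a hypothetical version that sorts first, next to a variant that compares neighbours in the original order. Both function bodies and the second function's name are assumptions.

```rust
/// Hypothetical reconstruction: true if any two values differ by at most `distance`.
fn check_within(values: &[i32], distance: i32) -> bool {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    // After sorting, the closest pair is always adjacent, so this covers all pairs.
    sorted.windows(2).any(|w| w[1] - w[0] <= distance)
}

/// Same comparison without sorting: only neighbours in draw order are checked,
/// so close values that are not adjacent in the slice are missed.
fn check_within_unsorted(values: &[i32], distance: i32) -> bool {
    values.windows(2).any(|w| (w[1] - w[0]).abs() <= distance)
}
```

For `[30, 2, 31, 60, 79]` the sorted version returns `true` (30 and 31 are close) while the unsorted one returns `false`, because the sorted check effectively covers all 10 pairs and the unsorted one only 4.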
So I guess it is normal that if I get 5 random numbers between 1 and 40, then 8 to 9 times out of 10 two of those 5 numbers will be within 3 of each other? Is there any way to prevent this behavior, or is it just the way randomness works?
Is there any way to prevent this behavior, or is it just the way randomness works?
You might be interested in learning about quasi-random (or low discrepancy) sequences.
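As a rough illustration of what a low-discrepancy sequence looks like, here is a minimal sketch of the base-2 van der Corput sequence scaled to an integer range. It is not part of rand, the output is deterministic rather than random, and `spread_values` is a hypothetical helper.

```rust
/// n-th term of the base-2 van der Corput sequence, a value in [0, 1).
/// The binary digits of `n` are mirrored around the binary point.
fn van_der_corput(mut n: u32) -> f64 {
    let mut value = 0.0;
    let mut weight = 0.5;
    while n > 0 {
        if n & 1 == 1 {
            value += weight;
        }
        weight /= 2.0;
        n >>= 1;
    }
    value
}

/// First `count` terms (starting at n = 1) scaled to integers in min..=max.
fn spread_values(min: u32, max: u32, count: u32) -> Vec<u32> {
    let span = (max - min + 1) as f64;
    (1..=count)
        .map(|n| min + (van_der_corput(n) * span) as u32)
        .collect()
}
```

`spread_values(1, 40, 5)` yields 21, 11, 31, 6, 26: no two values are within 3 of each other, but the values are also entirely predictable, which is the trade-off with this kind of sequence.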
By the way love you guys' work and appreciate the quick feedback. Do you guys have any plans to add quasirandom sequences to this library or any plans on creating a new library that generates quasirandom sequences?
No. #182 was related.
The main problem is that our algorithms (`Uniform`, `Bernoulli`, shuffling, non-linear sampling, ...) assume that the underlying generator is uniform. If this assumption is dropped, it's hard to say what exactly would happen without case-by-case analysis (e.g. your `1..=40` range may no longer be uniformly sampled on aggregate, or the distribution of results might appear much the same; you get very different results if you use `x % 40` instead of `x / ($ty::MAX / 40)`).
Feel free to make your own non-uniform generator (`RngCore`) and plug it into rand's algorithms, but don't complain if the results are unexpected.
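To make the `x % 40` versus `x / ($ty::MAX / 40)` point concrete, here is a small sketch with `u32` standing in for `$ty` and a bit-reversed counter standing in for a non-uniform generator. None of this is rand code; the function names are invented for the example, and it does not implement `RngCore`.

```rust
/// Deliberately non-random "generator": the bit-reversed counter (illustration only).
fn bit_reversed(n: u32) -> u32 {
    n.reverse_bits()
}

/// Range reduction that keeps the remainder, so it depends on the low-order structure of x.
fn by_modulo(x: u32) -> u32 {
    x % 40
}

/// Range reduction driven by the high bits. Note: this can return 40 for x
/// very close to u32::MAX, so the output range is really 0..=40.
fn by_division(x: u32) -> u32 {
    x / (u32::MAX / 40)
}

/// Applies `map` to the first five outputs of the bit-reversed counter (n = 1..=5).
fn first_five<F: Fn(u32) -> u32>(map: F) -> Vec<u32> {
    (1..=5).map(|n| map(bit_reversed(n))).collect()
}
```

With this generator, `first_five(by_division)` gives 20, 10, 30, 5, 25, while `first_five(by_modulo)` gives 8, 24, 32, 32, 0, including a repeated value. The same input sequence produces quite different results depending on which mapping an algorithm happens to use.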
Summary
I don't think this is a bug, but it seems that a lot of the random numbers are clumped together. For example, if I get 5 random numbers between 1 and 80 and check whether any of the 5 are within 3 of each other, the check is most likely going to be true.
What behavior is expected, and why?
I would expect the numbers to be more spaced out. Like maybe 2, 28, 34, 55, 79.
Code demonstrating the problem
As you can see, when min = 1 and max = 80, 7/10 sets have numbers within 3 of each other; when min = 1 and max = 40, 9/10 sets do. I know that `thread_rng().gen_range` uses `Uniform`. Is there another struct that will give me the behavior that I'm expecting?