ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
35.19k stars 2.57k forks source link

Xoroshiro128 boolean generation #9292

Closed leesongun closed 3 years ago

leesongun commented 3 years ago

Currently std.rand.Random.boolean generates random boolean by truncating to u1, but it is recommended to use most significant bit to extract boolean from Xoroshiro128+ generator because of its low linear complexity of low bits. [1]

[1] "We suggest to use a sign test to extract a random Boolean value, and right shifts to extract subsets of bits." (source)

jedisct1 commented 3 years ago

std.rand.Random is not tied to a specific algorithm.

So, rather than introducing a hack for the current algorithm, maybe solving the root cause by using Xoshiro++ or Xoshiro** would be better.

leesongun commented 3 years ago

I think significant bits are tested more thoroughly in most PRNG as they'll be either tested uniformly (as in matrix rank test) or tested as floating number(as in Runs test). So I guess it makes sense to change the bit extraction mechanism uniformly.

jedisct1 commented 3 years ago

Excluding Xoroshiro+, this is probably not going to make any difference in the current set of RNGs we have.

But lower bits also have a shorter period in LCGs, so unconditionally using the most significant bits may indeed not be a bad idea. This is going to be a breaking change, though.

jedisct1 commented 3 years ago

That being said, xoroshiro128+ is really designed for floating-point numbers. The short period of the lowest bits is affecting the Random.fill() function, which ironically, also makes our Random.float() implementation biased.

So, while it wouldn't hurt to change Random.bool() to use the top bit, I still think that we should switch to xoshiro128++ or xoshiro256++ for the general case.

jedisct1 commented 3 years ago

Feel free to assign this to me, I'll do it tomorrow.