Open Plazmaz opened 4 years ago
See https://github.com/golang/go/issues/21835. Perhaps this package should suggest using x/exp/rand.
For whoever might update the documentation: the default Source is a lagged Fibonacci generator with j=273, k=607.
To answer how it differs from Mersenne Twister, LFG is essentially a simple LFSR over elements in GF(2^64), which as #21835 mentions has a number of empirical and theoretical problems; MT is a linear recurrence over a polynomial in GF(2^19937) in its usual implementation with another linear transform applied to its outputs, giving it strong theoretical performance. (I can offer more details on the differences if desired later, but I have a final to take now.)
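To make the construction concrete, here is a minimal sketch of an additive lagged Fibonacci generator with the lags mentioned above (j=273, k=607), stepping x[n] = x[n-273] + x[n-607] (mod 2^64). The state initialization here is a placeholder scramble purely for illustration; the real math/rand seeds each word from an auxiliary LCG instead.

```go
package main

import "fmt"

const (
	lagJ = 273 // short lag
	lagK = 607 // long lag; also the state size
)

// alfg holds the last lagK outputs plus two cursors into them.
type alfg struct {
	vec  [lagK]uint64
	tap  int // index of the element lagJ steps back
	feed int // index of the element lagK steps back
}

// newALFG fills the state with an arbitrary placeholder pattern
// (a SplitMix64-style scramble); NOT how math/rand actually seeds.
func newALFG(seed uint64) *alfg {
	g := &alfg{tap: 0, feed: lagK - lagJ}
	s := seed
	for i := range g.vec {
		s += 0x9e3779b97f4a7c15
		z := s
		z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
		z = (z ^ (z >> 27)) * 0x94d049bb133111eb
		g.vec[i] = z ^ (z >> 31)
	}
	return g
}

// next applies the recurrence x[n] = x[n-273] + x[n-607] (mod 2^64)
// and stores the result back into the ring buffer.
func (g *alfg) next() uint64 {
	g.tap--
	if g.tap < 0 {
		g.tap += lagK
	}
	g.feed--
	if g.feed < 0 {
		g.feed += lagK
	}
	x := g.vec[g.feed] + g.vec[g.tap]
	g.vec[g.feed] = x
	return x
}

func main() {
	g := newALFG(1)
	for i := 0; i < 3; i++ {
		fmt.Println(g.next())
	}
}
```

Note that every output past the initial state satisfies the recurrence against earlier outputs, which is what makes the generator a linear recurrence and hence easy to analyze (and, per #21835, easy to predict).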
While I agree that math/rand probably ought to recommend a different generator for many applications, I find it odd for a standard package to recommend something in x/exp. How close is x/exp/rand to being promoted to x/math/rand or similar?
The proposal has not even been accepted yet. If it does get accepted, it will still be a while, as the process described in the proposal makes clear.
So I've done quite a bit of digging since originally opening this issue. Something I can't find an answer to is why the LCG used to seed the main generator uses Schrage's method instead of 64-bit arithmetic. I'm wondering if that was an intentional decision or simply an artifact of the original systems this implementation was written for. It seems like using 32-bit logic has a few downsides:

- Seeds are truncated to 32 bits for use in the LCG (I am not positive that's why this was done, but it seems to be the case)

If someone has more context or info on the decision behind that, that would be great. It's possible I would need to ask one of the original authors (Ken Thompson seems like the person who'd know, just looking at this comment: https://groups.google.com/g/golang-nuts/c/RZ1G3_cxMcM/m/_7J7tnHhsU4J).
They are not quite truncated, as they are moduloed (is that a word?) with an odd number first. But yes, half the bits of entropy are thrown away.
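To illustrate the reduction being described: the 64-bit seed is folded modulo 2^31-1 before it reaches the LCG, so distinct seeds collapse onto at most 2^31-2 starting points. This is a sketch of the idea based on the linked rng.go; the nonzero fallback constant is a placeholder standing in for whatever fixed value the real code substitutes when the reduction yields zero.

```go
package main

import "fmt"

const int32max = 1<<31 - 1 // 2^31 - 1, the LCG modulus

// reduceSeed sketches the reduction applied before seeding the
// auxiliary LCG: the 64-bit seed is folded into [1, 2^31-2].
func reduceSeed(seed int64) int32 {
	seed %= int32max
	if seed < 0 {
		seed += int32max
	}
	if seed == 0 {
		seed = 89482311 // arbitrary nonzero fallback (placeholder)
	}
	return int32(seed)
}

func main() {
	// Two very different 64-bit seeds that collapse to the same LCG start:
	a, b := int64(12345), int64(12345)+int32max
	fmt.Println(reduceSeed(a) == reduceSeed(b)) // true: bits above 31 are discarded
}
```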
Just a guess, but seedrand does division/modulo on int32s, and maybe that was considered expensive if changed to int64 on 32-bit machines. Maybe that was a big deal ~10 years ago, but I don't think that's a big problem nowadays.
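For concreteness, Schrage's method computes a*x mod m without ever forming the full product, so every intermediate value fits in a signed 32-bit integer. A sketch of one MINSTD-style step (a = 48271, m = 2^31-1), which appears to match the parameters seedrand uses; the identity holds because m = a*q + r with r < q:

```go
package main

import "fmt"

// schrageStep computes (a*x) mod m using only 32-bit signed arithmetic.
// Writing m = a*q + r with r < q, the identity
//   a*x mod m = a*(x mod q) - r*(x/q)   (+ m if the result is negative)
// keeps every intermediate product below 2^31.
func schrageStep(x int32) int32 {
	const (
		m int32 = 1<<31 - 1 // 2^31 - 1, the modulus
		a int32 = 48271     // MINSTD multiplier
		q int32 = m / a     // 44488
		r int32 = m % a     // 3399
	)
	hi := x / q
	lo := x % q
	x = a*lo - r*hi
	if x < 0 {
		x += m
	}
	return x
}

func main() {
	// Cross-check against the straightforward 64-bit computation.
	x := int32(1)
	for i := 0; i < 5; i++ {
		want := int32(int64(48271) * int64(x) % (1<<31 - 1))
		x = schrageStep(x)
		fmt.Println(x == want)
	}
}
```

The trick buys nothing on hardware with cheap 64-bit multiplies, which is presumably the heart of the question: on the 32-bit machines Go originally targeted, avoiding a 64-bit multiply and divide per step was a real saving.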
I mean, it's being explicitly cast to an int32; the effect, truncating the high bits or taking a modulus, is the same thing in this context. Regardless, the rest of this code is littered with 64-bit logic. To seed the generator, three values from seedrand are combined into an int64 for each entry in the state vector, which is then XOR'd with the 64-bit value already in the state vector, so I think there would still be plenty of 64-bit arithmetic on a 32-bit machine anyway. I'm genuinely curious about the intent behind this. I have my own theories, for sure, but if someone actually knows why this was done, that would be helpful. Otherwise I'll see if I can get in contact with people involved in the original implementation and find out!
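To make the seeding step described above concrete: each 64-bit state word is assembled from three consecutive LCG outputs, layered at bit offsets 40, 20, and 0 and XOR'd together (and, in the real code, XOR'd with a precomputed table entry as well). This is a sketch of just the packing; the LCG step here is a stand-in using 64-bit arithmetic for brevity, and the offsets are taken from my reading of rng.go, so treat them as illustrative.

```go
package main

import "fmt"

// lcgStep is a stand-in for math/rand's seedrand (same MINSTD
// parameters, but computed with 64-bit arithmetic for brevity).
func lcgStep(x int32) int32 {
	return int32(int64(x) * 48271 % (1<<31 - 1))
}

// pack64 combines three 31-bit LCG outputs into one 64-bit word by
// XOR-ing them at bit offsets 40, 20, and 0 (the windows overlap,
// which is fine since the combination is only a mixing step).
// It returns the advanced LCG state alongside the packed word.
func pack64(x int32) (int32, uint64) {
	x = lcgStep(x)
	u := uint64(x) << 40
	x = lcgStep(x)
	u ^= uint64(x) << 20
	x = lcgStep(x)
	u ^= uint64(x)
	return x, u
}

func main() {
	x := int32(1)
	var u uint64
	for i := 0; i < 3; i++ {
		x, u = pack64(x)
		fmt.Printf("%016x\n", u)
	}
}
```

Note this consumes three 31-bit LCG outputs per 64-bit state word, which is exactly why the whole seeding path runs through int64 values regardless of how the LCG step itself is computed.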
Does this issue reproduce with the latest release?
Yes
What did you expect to see?
Details about the algorithm used, links to source material, context about why this algorithm was used and how it differs from other methods.
What did you see instead?
https://github.com/golang/go/blob/c2edcf4b1253fdebc13df8a25979904c3ef01c66/src/math/rand/rng.go#L7-L12

A single vague comment listing two names, without the titles of the sources used, the name of the algorithm, or any details whatsoever. The names alone don't seem to be sufficient for finding the source material (at least based on a good amount of googling). It seems I'm not the only person having this issue: https://www.seehuhn.de/blog/134.html
Could someone provide some more context on what algorithm is being used, why it was chosen, and how it differs from something like the Mersenne Twister? I think it would be valuable knowledge and a good addition to that documentation.