math/rand/v2: revised API for math/rand

rsc commented 1 year ago

Based on earlier discussions in #60751, #26263, and #21835, as well as discussions with @robpike, I propose adding a new version of math/rand, imported as math/rand/v2, to the standard library. The full text is below but is nearly identical to what we discussed in #60751.

The most direct motivation for this proposal is to clean up math/rand and fix these many lingering problems, especially the use of an outdated generator, slow algorithms, and the unfortunate collision with crypto/rand.Read.

A more indirect but equally important motivation is to set an example for other v2 packages in the standard library. Creating math/rand/v2 will let us work out tooling issues (support for v2 packages in gopls, goimports, and so on) in a relatively rarely used package with fairly low stakes attached before moving on to more commonly used, higher-stakes packages like sync/v2 or encoding/json/v2.

This proposal establishes the pattern for v2 packages in the standard library, namely that they are subdirectories of the original package and that the starting point for the API is the original package, with each deviation clearly justified. In particular, v2 packages are not an opportunity to redesign something from scratch "just because". The existing APIs work well, and deviations should be carefully considered. A from-scratch new API is just as likely (probably more likely) to introduce new problems as it is to fix existing ones.

Once math/rand/v2 has shipped, we would tag and delete x/exp/rand. This would keep programs that already use x/exp/rand working (by reference to the tagged version or older versions of x/exp) but allow us to delete the code from the main branch of x/exp, making clear that development on it has ended.

A prototype of these math/rand/v2 changes can be found at https://go.dev/cl/502506 and the CLs below it in the stack.

Proposal

The math/rand/v2 API would use math/rand as a starting point and then make the following backwards incompatible changes:

Remove Rand.Read and top-level Read. It is almost always a mistake to pretend that a pseudo-random generator is a good source of arbitrarily long byte sequences. The kinds of simulations and non-deterministic algorithms for which math/rand is a good choice almost never need byte sequences. Read is the only common piece of API between math/rand and crypto/rand, and code should essentially always use crypto/rand.Read instead. (math/rand.Read and crypto/rand.Read are problematic because they have the the same signature; math/rand.Int and crypto/rand.Int also both exist, but with different signatures, meaning code never accidentally mistakes one for the other.)
Remove Source.Seed, Rand.Seed, and top-level Seed. Top-level Seed is deprecated as of Go 1.20. Source.Seed and Rand.Seed assume that the underlying source can be seeded by a single int64, which is only true of a limited number of sources. Specific source implementations can provide Seed methods with appropriate signatures, or none at all for generators that cannot be reseeded; the details of seeding do not belong in the general interface.

Note that removing top-level Seed means that the top-level functions like Int will always be randomly seeded rather than deterministically seeded. math/rand/v2 will not pay attention to the randautoseed GODEBUG setting that math/rand does; auto-seeding for top-level functions is the only mode. This means in turn that the specific PRNG algorithm used by the top-level functions is unspecified and can change from release to release without breaking any existing code.
Change the Source interface to have a single Uint64() uint64 method, replacing Int63() int64. The latter was overfitting to the original Mitchell & Reeds LFSR generator. Modern generators can provide uint64s, and uint64s have fewer special cases at call sites.
Remove Source64, which is unnecessary now that Source provides the Uint64 method.
Use a more straightforward implementation in Float32 and Float64. Taking Float64 as an example, it originally used float64(r.Int63()) / (1<<63), but this has the problem of occasionally rounding up to 1.0, which Float64 must not. We tried changing it to float64(r.Int63n(1<<53) / (1<<53), which avoids the rounding problem, but we decided it was a backwards compatibile change to break the value streams that rand.Rand generators, and instead added a retry loop for the rare 1.0 case. Now we can make the breaking change, which is simpler and faster.

Note that some people have observed that the simple division does not make use of all the possible float64 values in the range [0, 1). For example values like 1/(1<<54), 1/(1<<55), 3/(1<<55), and so on are not generated at all, both in today's math/rand and in this simpler algorithm. Only the values 0 and 1/(1<<53) are, not the ones in between. It is possible to introduce even more complex algorithms to spread out the low values more while preserving something that can be deemed a uniform distribution, but these algorithms seem like overkill. The simple division should continue to suffice.
(New since #60751.) Correct bias problems in ExpFloat64 and NormFloat64. @flyingmutant reports that these have detectable bias, perhaps due to a mistake in porting the Ziggurat algorithms from float32 to float64. We can take this opportunity to correct those.
Implement Rand.Perm in terms of Rand.Shuffle. Shuffle is a bit more efficient, and then we have only one implementation. It has been suggested elsewhere to remove Rand.Perm instead, but keeping it costs little (especially if implemented in terms of Shuffle) and avoids unnecessary churn for users.
Rename Intn, Int31, Int31n, Int63, Int64n to IntN, Int32, Int32N, Int64, Int64N. The 31 and 63 in the names are unnecessarily pedantic and confusing, and the capitalized N is more idiomatic for a second "word" in the name in Go. Note: The capital N's in the names are a change from #60751.
Add Uint32, Uint32N, Uint64, Uint64N, Uint, UintN, both as top-level functions and methods on Rand. At the least, we need Uint64 to provide access to the widest values that the Source can provide. But we may as well complete the set while we are here. Note: The capital N's in the names are a change from #60751.
(New since #60751) Add a generic top-level function N that is like Int64N or Uint64N but works for any integer type. In particular this allows using rand.N(1*time.Minute) to get a random duration in the range [0, 1*time.Minute).
Use Lemire's algorithm in N, IntN, UintN, and so on. Preliminary benchmarks show a 40% savings compared to v1 Int31n and a 75% savings compared to v1 Int63n. (Like with Float64, since this changes the values returned, it is a breaking change that can only be applied in math/rand/v2.)
Add a new Source implementation, PCG-DXSM, with this API:
```
 func NewPCG(seed1, seed2 uint64) *PCG
 type PCG struct { ... }
 func (p *PCG) Uint64() uint64
 func (p *PCG) Seed(seed1, seed2 uint64)
 func (p *PCG) MarshalBinary() ([]byte, error)
 func (p *PCG) UnmarshalBinary(data []byte) error
```
PCG is a simple, efficient algorithm with good statistical randomness properties. The DXSM variant was introduced by the author specifically to correct a rare, obscure shortcoming in the original (PCG-XSLRR) and is now the default generator in Numpy.

PCG is provided for people who are writing simulations and need a seeded, deterministic source of randomness suitable for most purposes. Note that PCG is not assumed in any code. In particular the top-level functions need not use PCG (and in the current prototype do not). If in the future we add another algorithm, it will sit alongside PCG as a true peer.

Note: MarshalBinary and UnmarshalBinary are new since #60751.
Remove the Mitchell & Reeds LFSR generator and NewSource. An alternative to removing it would be to give it a name like NewLFSR or NewMitchellReeds, but that would serve no purpose. The obvious purpose would be to provide the original math/rand source so that code that needs to reproduce the exact math/rand outputs can do so. But the other changes to Rand, in routines like Intn and Float64, change the derived values from Rand's methods for a given Source input stream. Since preserving the math/rand Source stream would not suffice to preserve the math/rand Rand stream, preserving the math/rand Source is not worthwhile. Programs that need exactly the math/rand value streams can continue to use that package; it will not be removed.
(New since #60751 and the original proposal in this issue) Add a new Source implementation, ChaCha8, with this API:
```
 func NewChaCha8(seed [32]byte) *ChaCha8
 type ChaCha8 struct { ... }
 func (c *ChaCha8) Uint64() uint64
 func (c *ChaCha8) Seed(seed [32]byte)
 func (c *ChaCha8) MarshalBinary() ([]byte, error)
 func (c *ChaCha8) UnmarshalBinary(data []byte) error
```
ChaCha8 is a cryptographically-strong random number generator derived from the ChaCha8 stream cipher. It provides security equivalent to ChaCha8 encryption.
(New since #60751 and the original proposal in this issue) Use a per-OS-thread ChaCha8 for the global random generator, both in math/rand/v2 and in math/rand (when unseeded). The current implementation uses a per-OS-thread wyrand, which was fine compared to the Mitchell/Reeds/Thompson generator but is not really great compared to PCG or ChaCha8. In particular, it is strange for the top-level randomness to be less studied and less robust than the two concrete implementations provided. Using ChaCha8 also gives the default top-level randomness real cryptographic strength, turning accidental use of math/rand where crypto/rand would be more appropriate into less of a security problem. Using ChaCha8 runs slower than using wyrand, but it is about as fast as the current sync.Mutex-locked generator in math/rand (~6ns per value on my 2019 Intel MacBook Pro).

Differences since previous discussion

The differences between this proposal and what we discussed in #60751 are:

Renamed Intn, Int64n, etc to IntN, Int64N, etc, capitalizing the N.
Added top-level generic rand.N function, which is like tbe current rand.Int64n or rand.Uint64n but works for any integer type.
Added MarshalBinary and UnmarshalBinary methods to the new PCG type.
Adjust the ExpFloat64 and NormFloat64 algorithms to remove bias. (Thanks to @zephyrtronium.)
Add ChaCha8 alongside PCG.
Use ChaCha8 for global generator.

These differences are also marked above.

Discussion Summary

Summarizing the discussion on #60751:

There was a suggestion to parameterize the top-level Intn, Uintn, Int64n, Uint64n etc functions. The single N function does that instead.
There was a suggestion to make PCG implement BinaryMarshaler and BinaryUnmarshaler, which has been accepted.
There was a suggestion to use more efficient algorithms for ExpFloat64 and NormFloat64, which has been accepted.
There was a suggestion to use more efficient but subtly biased fixed-point multiplication-based algorithms for Int64N and related functions. Careful benchmarks do not seem to bear out the benefits of these, so I've left them using a variant of Lemire's algorithm that is less efficient only in the rare cases where it is correcting the subtle bias.
There was a suggestion to use a more complex implementation of Float64 that spreads some of the probability of very small values around to even smaller values that would otherwise never be generated. Specifically, values betwen 0 and 1/2⁵³ are never generated, so the suggestion is to generate 0 and 1/2⁵³ with probability less than the current 1-in-2⁵³ and reuse some of that probability for values in between. This does not seem worth the cost for two reasons. First, the simple division algorithm produces numbers that have uniform gaps between them. That is, the gaps near 1 are the same size as the gaps near zero. That specific uniformity is lost if you get clever about creating smaller gaps near zero. Second, many times the result of Float32 or Float64 is scaled or translated or both (Float64()*X + Y). The +Y undoes all the work to create those gaps, and the *X scales the gaps so that in the new range it's still not true that every possible floating point value in the range will be generated. (This basic rationale is in the proposal text below as well, since it was included in the text used to begin the discussion at #60751.)
There was a suggestion to remove the indirection of the Source abstraction, but that abstraction is important for being able to substitute a different pseudo-random algorithm, especially in simulations that care about the algorithm. The suggestion was motivated by performance reasons, but I am confident any performance problems can be handled by devirtualization in the compiler.
There was a suggestion to use v2/math/rand instead, but generally the v2 seems like a "lower order bit" than the package path, and so it should come after the package path. That form would also break 'go test math...' or at least not include math/rand. Another, related problem with v2/math/rand is that at some point we may want to break the standard library into separate modules. For example a crypto module could provide crypto/.... If some crypto packages are v2, like crypto/tls/v2, then there is no problem with the module named crypto providing both crypto/tls and crypto/tls/v2. However, the module named crypto cannot provide v2/crypto/tls, because strings.HasPrefix(v2/crypto/tls, crypto/) == false. I don't know whether it's a good idea to break the standard library up this way or not, but I think we want to at least leave that door accessible in case we want to walk through it. v2/crypto/tls or v2/math/rand would block that door. It is worth noting that the standard library is not special in being allowed to introduce v2 subdirectories for specific packages. Any module can do this, and in particular v1 modules can do this.
There was a suggestion to leave the top-level Read removed but add back rand.Rand.Read. The need for this is unclear, so it seems worth leaving out for now.
There was a suggestion to provide access to a goroutine-safe implementation of Source. That is not a common need, and it can be implemented trivially as a wrapper around rand.Uint64.
There was a suggestion to rename the package from math/rand to math/pseudorand, to try to draw more of a distinction between math/rand and crypto/rand. I have not done this, for a few reasons. First, even crypto/rand can be pseudo-random, just in a cryptographically stronger way. Second, the names are long and not Go-like. Third, there is already an established name for this package, and it is math/rand. If you notice there are packages named math/rand and math/rand/v2, the relationship is clear. If you notice packages named math/rand and math/pseudorand, the relationship is much less clear. Just as we are conservative about changing established API from v1 to v2, we should be conservative about changing established names.
There was a hyperbolic suggestion that Go "would do a great service to its users by making it as hard as possible to use any RNG other than crypto/rand." We are not going to take that approach, but we've added ChaCha8 as a source you can construct explicitly, and we've also changed the top-level functions to use ChaCha8 underneath, although this is not a documented guarantee.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

zephyrtronium commented 1 year ago

Adjust the ExpFloat64 and NormFloat64 algorithms to remove bias. (Not implemented yet.)

I have an example implementation of this in https://github.com/zephyrtronium/pgregory_rand/tree/master/misc/practrand which I used to verify that it does resolve the bias detected by practrand. My approach uses recalculated 1024-step ziggurats specifically built for 64-bit PRNGs to improve the rejection procedure efficiency, which gives about -6% ns/op versus the 128-step normal ziggurat and +4% ns/op versus the 256-step exponential one (possibly due to bounds checking? I can't investigate at the moment), at the cost of the tables being a collective 48 kB instead of 4.5 kB.

goos: linux
goarch: amd64
pkg: math/rand
cpu: 12th Gen Intel(R) Core(TM) i5-1235U
               │ /home/branden/old.bench │      /home/branden/new.bench       │
               │         sec/op          │   sec/op     vs base               │
NormFloat64-12               3.961n ± 2%   3.709n ± 0%  -6.35% (p=0.000 n=20)
ExpFloat64-12                3.801n ± 1%   3.974n ± 0%  +4.56% (p=0.000 n=20)
geomean                      3.880n        3.840n       -1.04%

An alternative implementation is to use the same tables but adjusting the input to avoid reusing random bits between the step selection and the final output, which would leave tables unchanged and should result in the same performance but still resolve the bias.

This is out of band for the proposal per se, but would it be appropriate for me to send in a CL based on 502506 implementing the new ziggurats? I don't know whether or how I can do that.

rsc commented 1 year ago

@zephyrtronium If you can send a CL with the new code that'd be great. We will not be able to submit it as is, but I can lift it from there and add it to the v2 stack. 48kB of table is a lot more than 4.5kB, so your alternative leaving the tables unchanged sounds like the right approach. Thanks!

joeshaw commented 1 year ago

Creating math/rand/v2 will let us work out tooling issues (support for v2 packages in gopls, goimports, and so on)

Typically packages are identified in code by the last portion of their import path. This would be undesirable for packages versioned this way, so is the subtext here that tools like goimports would recognize math/rand/v2 and write import statements with renamed identifiers (import rand "math/rand/v2")?

rsc commented 1 year ago

@joeshaw Tools like goimports already recognize math/rand/v2 and understand that it is 'rand' even without a renamed import. Such paths are common already for the root package of a module that is >=v2.