Closed: dgryski closed this issue 7 years ago.
I took a look into this and made a benchmark with a Go xorshift library I found on GitHub: https://github.com/lazybeaver/xorshift#description
Here is the benchmark I made: http://pastebin.com/wEMKSqPf Note that the benchmark includes getting the random numbers into the 1-10000 range, as per the test requirements.
To run:
1) Go must be installed.
2) Put the code in a file whose name ends in `_test.go`, e.g. `rand_test.go`, preferably in its own directory (a minimal sketch of such a file is shown after this list).
3) Run `$ go test -bench=.` in that directory; the `.` is a regex selecting which benchmarks to run (here, all of them).
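For reference, here is a minimal sketch of what such a `_test.go` file can look like. This is not the pastebin code; it only benchmarks the stock `rand.Intn`, with the same 1-10000 bounding mentioned above.

```go
// rand_test.go -- a minimal benchmark file sketch.
package main

import (
	"math/rand"
	"testing"
)

var sink int // assigned so the compiler cannot eliminate the benchmarked call

func BenchmarkRandIntn(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = rand.Intn(10000) + 1 // random number in the 1-10000 range
	}
}
```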
Findings (with fluff edited out):
```
$ go test -bench=.
BenchmarkNewRandIntn            50000000    36.8 ns/op
BenchmarkNewXorShift64Star     200000000     6.61 ns/op
BenchmarkNewXorShift128Plus    300000000     4.47 ns/op
BenchmarkNewXorShift1024Star   200000000     7.61 ns/op
$ go test -bench=.
BenchmarkNewRandIntn            50000000    36.9 ns/op
BenchmarkNewXorShift64Star     200000000     6.59 ns/op
BenchmarkNewXorShift128Plus    300000000     4.44 ns/op
BenchmarkNewXorShift1024Star   200000000     7.61 ns/op
$ go test -bench=.
BenchmarkNewRandIntn            30000000    36.7 ns/op
BenchmarkNewXorShift64Star     200000000     6.60 ns/op
BenchmarkNewXorShift128Plus    300000000     4.46 ns/op
BenchmarkNewXorShift1024Star   200000000     7.63 ns/op
```
Looks to me like you are right about the slowdown in `rand.Intn`. Of the three xorshift variants, XorShift128Plus is the fastest. This makes sense: the finalizing step applied after the shifts is an addition for the "plus/+" variants and a multiplication for the "star/*" variants, and addition is cheaper than multiplication.
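To make the plus/star distinction concrete, here is a sketch of the two core steps. The shift and multiplier constants are the ones from Vigna's papers; the library linked above may use slightly different ones.

```go
package main

import "fmt"

// xorshift64*: after the shifts, the state is multiplied by a 64-bit constant ("star").
func xorshift64star(s *uint64) uint64 {
	x := *s
	x ^= x >> 12
	x ^= x << 25
	x ^= x >> 27
	*s = x
	return x * 2685821657736338717
}

// xorshift128+: after the shifts, the two state words are simply added ("plus").
func xorshift128plus(s *[2]uint64) uint64 {
	x, y := s[0], s[1]
	s[0] = y
	x ^= x << 23
	s[1] = x ^ y ^ (x >> 17) ^ (y >> 26)
	return s[1] + y
}

func main() {
	s64 := uint64(42)
	s128 := [2]uint64{42, 7}
	fmt.Println(xorshift64star(&s64), xorshift128plus(&s128))
}
```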
I can go ahead and update the tests after some people have had a chance to look at this.
I wrote https://github.com/dgryski/trifles/blob/master/fastrand/fastrand.go, which includes an Intn that avoids modulo bias on bounded ranges.
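For context, the usual way such an `Intn` avoids modulo bias is rejection sampling. Below is a minimal sketch of that idea over any 64-bit source; it is not the fastrand code itself, and `intn` is a hypothetical helper assuming n > 0.

```go
package main

import (
	"fmt"
	"math/rand"
)

// intn re-draws until the value falls below the largest multiple of n that
// fits in a uint64, so every residue class in [0, n) is equally likely.
func intn(next func() uint64, n uint64) uint64 {
	limit := ^uint64(0) - ^uint64(0)%n // largest multiple of n representable in a uint64
	v := next()
	for v >= limit {
		v = next() // reject the biased tail instead of folding it into [0, n)
	}
	return v % n
}

func main() {
	fmt.Println(intn(rand.Uint64, 10000) + 1) // uniform row id in 1..10000
}
```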
The slowdown will be larger under concurrent code due to lock contention, so it's not just the speed of the RNG.
Did you measure it? The JSON test, which doesn't use rand, is also somewhat slow. I don't think this is the bottleneck of the benchmark.
I have not benchmarked this particular case, but have seen numerous times in the past where access to a single-threaded rand instance has caused performance degradation due to lock contention. I'm fine to ignore this issue until a blocking profile shows otherwise.
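For reference, one common way to sidestep that contention (a sketch only, not something the benchmark currently does) is to give each goroutine its own generator, for example via a sync.Pool; randomRowID is a hypothetical helper here.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// rngPool hands each goroutine its own *rand.Rand, so concurrent handlers do
// not contend on the mutex guarding the package-level generator.
var rngPool = sync.Pool{
	New: func() interface{} {
		return rand.New(rand.NewSource(time.Now().UnixNano()))
	},
}

func randomRowID() int {
	r := rngPool.Get().(*rand.Rand)
	id := r.Intn(10000) + 1
	rngPool.Put(r)
	return id
}

func main() {
	fmt.Println(randomRowID())
}
```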
The RNG is used only in the Fortune test. The Fortune test involves the HTTP server, JSON encoding, and a database query. The database query also takes a lock. I don't think the lock around the RNG can be the bottleneck.
Looking at https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Go/go/src/hello/hello.go it seems `rand.Intn` is called in almost all the handlers to select random rows from the database.
I'm sorry, I was wrong. But the plaintext and JSON tests don't use rand. All the other tests hit the database as often as they use rand, so rand can only be the bottleneck if it is slower than the database.
Locks in `db.Query()` (fastest case):
https://github.com/golang/go/blob/master/src/database/sql/sql.go#L666-L682
https://github.com/golang/go/blob/master/src/database/sql/sql.go#L750-L782

`src.Int63()` (called while the lock is held):
https://github.com/golang/go/blob/master/src/math/rand/rng.go#L232-L246
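For readers following along, the math/rand side of that comparison is roughly the pattern below: one mutex in front of one shared source, with the generator work running while the lock is held. This is a simplified stand-in, not the actual standard-library code.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// lockedSource is a simplified stand-in for the pattern discussed above:
// every caller serializes on one mutex, and the Int63 work runs under it.
type lockedSource struct {
	mu  sync.Mutex
	src rand.Source
}

func (s *lockedSource) Int63() int64 {
	s.mu.Lock()
	n := s.src.Int63()
	s.mu.Unlock()
	return n
}

func (s *lockedSource) Seed(seed int64) {
	s.mu.Lock()
	s.src.Seed(seed)
	s.mu.Unlock()
}

func main() {
	r := rand.New(&lockedSource{src: rand.NewSource(1)})
	fmt.Println(r.Intn(10000) + 1)
}
```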
Additionally, there is already go-prefork. I don't want to add complexity or a dependency without a significant improvement.
As I said, I'm fine to ignore this until it shows up in a profile.
It might be interesting to profile with the new lock-contention profile in 1.8: https://rakyll.org/mutexprofile/
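A sketch of wiring that up (a hypothetical server stub, not the benchmark's actual code): enable sampling with runtime.SetMutexProfileFraction and expose the profile through net/http/pprof.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/, including /debug/pprof/mutex
	"runtime"
)

func main() {
	// Sample roughly one in five mutex contention events (Go 1.8+).
	runtime.SetMutexProfileFraction(5)

	// ... register the benchmark handlers here ...

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Then `go tool pprof http://localhost:8080/debug/pprof/mutex` shows where goroutines spent time blocked on contended locks.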
Similar to the Java issue #1152, all calls to rand.Intn() are behind a single mutex in Go to prevent concurrent access to the RNG. These calls should be updated to use an xorshift RNG seeded with the current time in nanoseconds.
https://github.com/TechEmpower/FrameworkBenchmarks/search?q=rand.Intn&type=Code&utf8=✓
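A sketch of what that change could look like inside a handler (hypothetical helper names, not an actual diff), reusing the xorshift64* step shown earlier and seeding it from the clock as described:

```go
package main

import (
	"fmt"
	"time"
)

// newState seeds an xorshift state from the wall clock in nanoseconds,
// guarding against the all-zero state that xorshift cannot recover from.
func newState() uint64 {
	s := uint64(time.Now().UnixNano())
	if s == 0 {
		s = 1
	}
	return s
}

// rowID advances a caller-owned xorshift64* state and maps the result into
// 1..10000. The plain modulo carries a tiny bias for this range; see the
// rejection-sampling sketch above for the unbiased version.
func rowID(state *uint64) int {
	x := *state
	x ^= x >> 12
	x ^= x << 25
	x ^= x >> 27
	*state = x
	return int((x*2685821657736338717)%10000) + 1
}

func main() {
	state := newState() // one state per goroutine/handler, no shared mutex
	fmt.Println(rowID(&state))
}
```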