Closed ralfkonrad closed 4 months ago
Thanks! Out of curiosity, how does it compare to our current default, InverseCumulativeRng<MersenneTwisterUniformRng, InverseCumulativeNormal>?
That's interesting, @lballabio.
On pure Windows, the current default is slightly faster (4.4 ns vs. 5.4 ns for the Ziggurat generator); on WSL Ubuntu it is slower (4.8 ns vs. 2.8 ns).
Windows CXX compiler MSVC 19.40.33808.0
-----------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------
xoshiro256StarStarZigguratGaussianNext.next(); 5.39 ns 5.47 ns 100000000
xoshiro256StarStarBoxMullerGaussian.next(); 9.30 ns 9.21 ns 74666667
mersenneTwisterBoxMullerGaussian.next(); 11.7 ns 11.2 ns 56000000
inverseCumulativeRng.next(); 4.41 ns 4.35 ns 154482759
WSL Ubuntu 22.04 CXX compiler GNU 11.4.0
-----------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------
xoshiro256StarStarZigguratGaussianNext.next(); 2.82 ns 2.82 ns 245704472
xoshiro256StarStarBoxMullerGaussian.next(); 9.27 ns 9.27 ns 74894426
mersenneTwisterBoxMullerGaussian.next(); 11.3 ns 11.3 ns 61261372
inverseCumulativeRng.next(); 4.80 ns 4.80 ns 145683037
The benchmarks can be found here: https://github.com/ralfkonrad/ql_performance_testing, so you might also run them on your Mac(?) to compare.
The default is slightly slower on my Mac as well:
-----------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------
xoshiro256StarStarZigguratGaussianNext.next(); 3.81 ns 3.81 ns 185110260
xoshiro256StarStarBoxMullerGaussian.next(); 7.54 ns 7.53 ns 92264298
mersenneTwisterBoxMullerGaussian.next(); 11.8 ns 11.8 ns 59168597
inverseCumulativeRng.next(); 4.30 ns 4.29 ns 163303394
Interesting, how differently this pure number crunching behaves on different architectures and compilers...
The improved Ziggurat method to generate normal random samples is significantly faster than BoxMuller. Therefore, it is e.g. the default generator in rust-random for StandardNormal distributions.
As the underlying RNG needs to provide std::uint64_t nextInt64() const random numbers, currently it only works in combination with Xoshiro256StarStarUniformRng.
On my local machine I get the following benchmark values.
Windows, CXX compiler MSVC 19.40.33808.0: approx. two times faster compared to BoxMuller with MersenneTwister.
WSL Ubuntu 22.04, CXX compiler GNU 11.4.0: approx. four times faster compared to BoxMuller with MersenneTwister.
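For readers following along, here is a minimal, self-contained sketch of how a Ziggurat sampler can consume the 64-bit integers returned by nextInt64(). It is not the code from this PR: the class names Xoshiro256StarStarSketch and ZigguratGaussianSketch, the 128-layer table, and the way the 64 bits are split into layer index, sign, and uniform are all assumptions made for illustration. The constants r and v are the 128-layer values given by Marsaglia and Tsang.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a uniform RNG exposing nextInt64(): the
// public-domain xoshiro256** step, with the state seeded via splitmix64.
class Xoshiro256StarStarSketch {
  public:
    explicit Xoshiro256StarStarSketch(std::uint64_t seed) {
        for (auto& s : s_) {
            seed += 0x9e3779b97f4a7c15ULL;
            std::uint64_t z = seed;
            z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
            z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
            s = z ^ (z >> 31);
        }
    }
    std::uint64_t nextInt64() const {
        const std::uint64_t result = rotl(s_[1] * 5, 7) * 9;
        const std::uint64_t t = s_[1] << 17;
        s_[2] ^= s_[0];
        s_[3] ^= s_[1];
        s_[1] ^= s_[2];
        s_[0] ^= s_[3];
        s_[2] ^= t;
        s_[3] = rotl(s_[3], 45);
        return result;
    }
    // Strictly positive uniform (at most 1) from the top 53 bits, safe for log().
    double nextReal() const {
        return (static_cast<double>(nextInt64() >> 11) + 0.5) / 9007199254740992.0;
    }
  private:
    static std::uint64_t rotl(std::uint64_t x, int k) { return (x << k) | (x >> (64 - k)); }
    mutable std::array<std::uint64_t, 4> s_{};
};

// Illustrative Ziggurat sampler for N(0,1); RNG must provide nextInt64() const
// and nextReal() const.
template <class RNG>
class ZigguratGaussianSketch {
  public:
    explicit ZigguratGaussianSketch(const RNG& rng) : rng_(rng) {
        const double r = 3.442619855899;       // where the tail starts (128 layers)
        const double v = 9.91256303526217e-3;  // common area of every layer
        x_[0] = v / pdf(r);                    // virtual width of the base layer
        x_[1] = r;
        for (int i = 1; i < 127; ++i)          // equal-area recursion for x_[2..127]
            x_[i + 1] = std::sqrt(-2.0 * std::log(v / x_[i] + pdf(x_[i])));
        x_[128] = 0.0;
        for (int i = 0; i <= 128; ++i) f_[i] = pdf(x_[i]);
    }
    double next() const {
        for (;;) {
            const std::uint64_t bits = rng_.nextInt64();
            const int i = static_cast<int>(bits & 127);     // bits 0-6: layer index
            const double sign = (bits & 128) ? -1.0 : 1.0;  // bit 7: sign
            const double u = static_cast<double>(bits >> 11) / 9007199254740992.0;
            const double x = u * x_[i];
            if (x < x_[i + 1]) return sign * x;             // fast path, no exp/log
            if (i == 0) return sign * tail();               // base layer: sample tail
            const double y = f_[i] + (f_[i + 1] - f_[i]) * rng_.nextReal();
            if (y < pdf(x)) return sign * x;                // wedge under the curve
        }
    }
  private:
    static double pdf(double x) { return std::exp(-0.5 * x * x); }  // unnormalised
    double tail() const {                                   // Marsaglia's tail method
        const double r = x_[1];
        for (;;) {
            const double a = -std::log(rng_.nextReal()) / r;
            const double b = -std::log(rng_.nextReal());
            if (2.0 * b > a * a) return r + a;
        }
    }
    RNG rng_;
    std::array<double, 129> x_{};
    std::array<double, 129> f_{};
};
```

Most draws return on the fast path after one nextInt64() call, one multiplication, and one comparison, which is the usual explanation for the gap to BoxMuller (which needs log, sqrt, and trigonometric calls). Usage would look like constructing ZigguratGaussianSketch<Xoshiro256StarStarSketch> from a seeded generator and calling next() in a loop.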