JuliaRandom / RandomNumbers.jl

Random Number Generators for the Julia Language.
https://juliarandom.github.io/RandomNumbers.jl/stable
Other
97 stars 23 forks source link

Very cheap PRNG #69

Closed milankl closed 4 years ago

milankl commented 4 years ago

I was wondering whether there are plans to add to this package of family of very cheap PRNG. I, personally, would like to compare Xoroshiro to something that I just came across:

// Cheap PRNG from "Numerical Recipes in C", page 284.
static inline uint32_t cheap_generator () {
    static unsigned long idum;
    idum = 1664525L*idum + 1013904223L;

    return idum;
}

Sure, that's so simple that one could just implement that directly, but it might be nice to have it as part of this package. Some applications (chaotic PDEs for me) probably don't need any high quality PRNG, therefore testing it against the (presumably) cheapest possible PRNG would be great.

milankl commented 4 years ago

Following how you wrote some PRNGs in this package, I assume you would do something like

mutable struct RandQD32
    idum::UInt64
end

function rand(r::RandQD32)
    r.idum = 1664525*r.idum + 1013904223
    return r.idum % UInt32
end
milankl commented 4 years ago

On that note, I don't find any performance increase by using this super simple PRNG

julia> r = Xoroshiro128Plus()
Xoroshiro128Plus(0xe83d479a785bbebd, 0xab00a029f48bb2ff)

julia> @btime rand($r,UInt32)
  1.591 ns (0 allocations: 0 bytes)
0xbe6a2409

julia> @btime rand($r,UInt64)
  1.286 ns (0 allocations: 0 bytes)
0xa4d9ad7c4a31f97e

baseline, versus

mutable struct QuickDirty32
    idum::UInt64
end

julia> function Base.rand(r::QuickDirty32)
           r.idum = 1664525*r.idum + 1013904223
           return r.idum % UInt32
       end

julia> r = QuickDirty32(UInt64(12312312312414))
QuickDirty32(0x00000b32af00725e)

julia> @btime rand($r)
  1.278 ns (0 allocations: 0 bytes)
0x073890ec

even a pointer implementation is not faster

julia> ptri = Ref(UInt64(123))
Base.RefValue{UInt64}(0x000000000000007b)

julia> function Base.rand(r::Base.RefValue{UInt64})
           r[] = 1664525*r[] + 1013904223
           return r[] % UInt32
       end

julia> ptri
Base.RefValue{UInt64}(0x000000000000007b)

julia> @btime rand($ptri)
  1.277 ns (0 allocations: 0 bytes)
0x6bb40978

Any idea why there is no performance increase from Xoroshiro to this one here?

sunoru commented 4 years ago

Hi, I'm sorry for forgetting to get back to you! Thank you very much to suggest this.

Yes, this is a very common and basic PRNG, but it is not so useful in most cases. It is called a "linear congruential generator" (LCG), which has a very small state (requiring very little memory) but does not have a good performance on statistical tests. And it is so simple to implement. People can just build them in several lines if they intend to use this kind of PRNG. I don't think it is needed here in this package.

As for the performance results you showed... My guess is that the bitwise operations are much faster than a multiplication operation.

milankl commented 4 years ago

Yes, I see this point. For an application like a system of chaotic PDEs, I suspect that it is not important to have a high quality PNRG. Do you know of any low quality but much faster than, say, compared to the Xorshifts? Small state shouldn't be a problem as the period of the chaotic system (which with finite precision also eventually repeats) is probably huge anyway and therefore will mask any small state.

sunoru commented 4 years ago

I am still now sure if you should use such a PRNG. Do you mean that you want to use a small state PRNG (so that has a small period) on purpose?

If not, I would just suggest one should use Xoshiro everywhere because it has a great statistical quality while it also runs very fast.

milankl commented 4 years ago

Yes, but is there another family or PRNGs that trade-in some of the quality for even higher speeds compared to Xoroshiro? I basically want to test whether in my applications I could get away with lower quality, which might be an overall advantage if I get speed instead.

sunoru commented 4 years ago

I don't know, honestly. The xorshift family is the fastest one I have used.

milankl commented 4 years ago

Okay, that is good to know! I somehow assumed that a PRNG could be even faster when quality is not required.