Closed: sylvaticus closed this issue 3 years ago
It seems that with this usage pattern (using `@threads`) you would not get reproducible results, so it's not clear to me what the benefit of using StableRNGs would be here. Sure, ideally it would be better to have `randjump` than not, but I don't know how to implement it off the top of my head, so I'm unlikely to do it myself.
Hello, thanks for the answer. Why would I not get reproducible results? In my tests with `MersenneTwister(123)` (on this and on a real-case random forest algorithm) I obtain the same result on each call.
It doesn't matter in which order the threads are executed: they each have a dedicated RNG with a well-defined starting state, and they write to different elements of the output array...
Oh right, OK, I had forgotten that `@threads` divides iterations equally among threads (statically).
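To make that concrete, here is a small check of my own (not from the thread): with the `:static` schedule, `@threads` assigns each thread a fixed, contiguous chunk of the iteration range, so the iteration-to-thread mapping is deterministic for a given thread count. (In recent Julia versions the default schedule is `:dynamic`, so `:static` makes the assumption explicit.)

```julia
using Base.Threads

# Minimal sketch: record which thread runs each iteration under the
# :static schedule; the mapping is fixed for a given thread count.
function iterationOwners(n)
    owner = zeros(Int, n)
    @threads :static for i in 1:n
        owner[i] = threadid() # safe: each iteration writes to its own slot
    end
    return owner
end

println(iterationOwners(8)) # e.g. with `julia -t 2`: [1, 1, 1, 1, 2, 2, 2, 2]
```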
For now, as a workaround, I would suggest defining `generateParallelRngs` for `StableRNG` by returning `n` RNGs initialized with random seeds.
I am doing that, but I get some strange behaviour with StableRNGs:
```julia
using Random, StableRNGs, Statistics

function generateParallelRngs(rng::AbstractRNG, n::Integer)
    seeds = [rand(rng, 100:18446744073709551615) for i in 1:n] # some RNGs have issues with too-small seeds
    rngs  = [deepcopy(rng) for i in 1:n]
    return Random.seed!.(rngs, seeds) # one independently seeded copy of rng per thread
end

function myFunction(rng = Random.GLOBAL_RNG)
    rngs       = generateParallelRngs(rng, Threads.nthreads()) # make new copy instances
    results    = Array{Float64,1}(undef, 30)
    masterSeed = rand(rng, 100:9999999999999)
    Threads.@threads for i in 1:30
        tsrng = rngs[Threads.threadid()]       # thread-safe random number generation: one RNG per thread
        Random.seed!(tsrng, masterSeed + i*10) # the seed depends on the i of the loop, not on the thread,
                                               # so we get the same results independently of the number of threads
        results[i] = sum(rand(tsrng, 1:1000, 100))
    end
    overallResult = mean(results)
    return overallResult
end

myFunction(MersenneTwister(123))
myFunction(MersenneTwister(123))
myFunction(StableRNG(123))
myFunction(StableRNG(123))
```
Here I get the same results (as expected) when I use the same number of threads.
But if I change the number of threads (e.g. one console with `julia -t 2` vs another one with `julia -t 4`), the output using `MersenneTwister(123)` remains constant across thread counts, while the results from `StableRNG(123)` differ between the two terminals... is that expected?
Just discovered that doing the master seeding before `generateParallelRngs` solves the issue, but I still wonder why, and why the issue appears only with StableRNG…
EDIT: I'm an idiot. Of course `rand` in `generateParallelRngs` is called a number of times proportional to the number of threads, so if I call `rand` on the same RNG again later, its output will depend on the number of threads. Now the question is the opposite one: why did the `MersenneTwister` results not change...
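For reference, a minimal sketch of the fix just described (my reconstruction, not code from the thread): draw the master seed before `generateParallelRngs` consumes a thread-count-dependent number of values from `rng`, so that draw sees the same RNG state regardless of `Threads.nthreads()`.

```julia
function myFunctionFixed(rng = Random.GLOBAL_RNG)
    masterSeed = rand(rng, 100:9999999999999) # consume from rng *before* the thread-count-dependent draws
    rngs       = generateParallelRngs(rng, Threads.nthreads())
    results    = Array{Float64,1}(undef, 30)
    Threads.@threads for i in 1:30
        tsrng = rngs[Threads.threadid()]
        Random.seed!(tsrng, masterSeed + i*10) # reseed per iteration, so results don't depend on the schedule
        results[i] = sum(rand(tsrng, 1:1000, 100))
    end
    return mean(results)
end

# Both RNG types should now give the same result for any thread count:
myFunctionFixed(StableRNG(123))
```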
Hello, for multithreaded models I am using `randjump` (source1, source2), but this is not defined for StableRNG.
What I understood is that `randjump` simulates, without actually computing them, a large collection of random numbers, so as to advance the state of the random number generator.
I am then using it in things like this: `outerFunction(x, rng=StableRNG(123))`.
However, this doesn't work, because `randjump` is not actually implemented for StableRNG.
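For context, here is a minimal sketch of the `randjump` pattern for `MersenneTwister` (my illustration; the helper name `jumpedRngs` is hypothetical, but `Future.randjump` and its precomputed jump distance `big(10)^20` are part of the standard library):

```julia
using Random, Future

# Hypothetical helper (not from the issue): build n MersenneTwister
# streams, each jumped 10^20 steps past the previous one, so the
# per-thread streams are guaranteed not to overlap.
function jumpedRngs(rng::MersenneTwister, n::Integer)
    rngs = Vector{MersenneTwister}(undef, n)
    rngs[1] = rng
    for i in 2:n
        rngs[i] = Future.randjump(rngs[i-1], big(10)^20)
    end
    return rngs
end

rngs = jumpedRngs(MersenneTwister(123), Threads.nthreads())
```

Something analogous (a jump function for its underlying generator) would be needed for StableRNG to support this pattern.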