Open vancleve opened 1 year ago
Note that on the remote process all RPCs are executed on separate tasks.
You can observe the same behaviour locally:
julia> using Random
julia> using Base.Threads: @spawn
julia> wait(@spawn Random.seed!(1234))
julia> fetch(@spawn rand())
0.9531220182290453
julia> wait(@spawn Random.seed!(1234))
julia> fetch(@spawn rand())
0.18986972383585565
This is expected behavior; see https://docs.julialang.org/en/v1/stdlib/Random/
However, the default RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter).
Since version 1.7, Julia uses a per-task random number generator, so seeding only influences the current task and its children.
If you want deterministic behavior, you will need to use a non-default RNG. Note that remote-process calls execute concurrently, so you may need to protect that global RNG with a lock.
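A minimal sketch of that suggestion, shown locally with tasks for brevity. The names `MY_RNG`, `MY_RNG_LOCK`, `locked_seed!` and `locked_rand` are made up for illustration and are not part of Random or Distributed; an explicit `Xoshiro` instance stands in for the "non-default RNG".

```julia
using Random
using Base.Threads: @spawn

# Hypothetical names, for illustration only.
const MY_RNG = Xoshiro(0)            # explicit RNG shared by all tasks in this process
const MY_RNG_LOCK = ReentrantLock()  # guards MY_RNG against concurrent access

# Reseed the shared RNG while holding the lock.
locked_seed!(seed) = lock(MY_RNG_LOCK) do
    Random.seed!(MY_RNG, seed)
    nothing
end

# Draw one number from the shared RNG while holding the lock.
locked_rand() = lock(MY_RNG_LOCK) do
    rand(MY_RNG)
end

# Seeding in one task and drawing in another now sees the same state,
# because both tasks use MY_RNG rather than their task-local RNG.
wait(@spawn locked_seed!(1234))
a = fetch(@spawn locked_rand())
wait(@spawn locked_seed!(1234))
b = fetch(@spawn locked_rand())
println(a == b)
```

With Distributed, the same definitions could presumably be wrapped in `@everywhere begin ... end` so that each worker holds its own `MY_RNG`, with remote calls going through `locked_seed!` and `locked_rand`.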
We should improve the documentation on this.
Got it! Thanks for the help!
I am doing the following as a workaround for this.
julia> using Distributed
julia> addprocs(1);
julia> @everywhere using Random
julia> @everywhere const rng_copy = copy(Random.default_rng());
julia> fetch(@spawnat 2 let
           Random.seed!(1234)
           copy!(getglobal(Main,:rng_copy), Random.default_rng())
           nothing
       end)
julia> fetch(@spawnat 2 let
           copy!(Random.default_rng(), getglobal(Main,:rng_copy))
           x = rand()
           copy!(getglobal(Main,:rng_copy), Random.default_rng())
           x
       end)
0.32597672886359486
While this works on 1.9, I'm pretty sure this code may break in future releases because, from what I can tell, copy! isn't the documented way to save or load the state of the TaskLocalRNG.
A related issue in Pluto: https://github.com/fonsp/Pluto.jl/issues/2290
We do actually have tests that copy works like that, so it is unlikely that it would change or break that example.
In 1.10 a similar example is broken.
julia> using Distributed
julia> addprocs(1);
julia> @everywhere begin
           using Random
           myrand() = fetch(Threads.@spawn(rand()))
           const rng_copy = copy(Random.default_rng())
       end
julia> fetch(@spawnat 2 let
           Random.seed!(1234)
           copy!(getglobal(Main,:rng_copy), Random.default_rng())
           nothing
       end)
julia> fetch(@spawnat 2 let
           copy!(Random.default_rng(), getglobal(Main,:rng_copy))
           x = myrand()
           copy!(getglobal(Main,:rng_copy), Random.default_rng())
           x
       end)
julia> fetch(@spawnat 2 let
           copy!(Random.default_rng(), getglobal(Main,:rng_copy))
           x = myrand()
           copy!(getglobal(Main,:rng_copy), Random.default_rng())
           x
       end)
julia> Random.seed!(1234);
julia> myrand()
julia> myrand()
In 1.9 both remote and local calls return the same random sequence.
In 1.10.0-beta2 the remote call always returns the constant 0.47487231547644215, so in this case, using copy! seems to break the rng of any spawned tasks.
Yeah, in 1.10 spawning a task no longer updates the current task's local RNG state (TaskLocalRNG()), as it did in 1.9, so what you are seeing is expected (calling myrand() doesn't mutate Random.default_rng()).
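A small local check of this, as a sketch rather than a definitive test: it compares the first number drawn after seeding with and without an intervening spawn. The variable names are made up for illustration. Per the comment above, the result should be `true` on 1.10 and `false` on 1.9, where spawning advanced the parent's state.

```julia
using Random
using Base.Threads: @spawn

# Baseline: seed, then draw directly in the current task.
Random.seed!(1234)
direct = rand()

# Same seed, but spawn a task that calls rand() before drawing here.
Random.seed!(1234)
fetch(@spawn rand())   # the child draws from its own task-local RNG
after_spawn = rand()

# true means the spawn left the parent's random stream untouched.
unchanged = (direct == after_spawn)
println("Julia ", VERSION, ": parent stream unchanged by spawn = ", unchanged)
```

This also shows why saving the parent's state with copy! after calling myrand() keeps restoring the same state on 1.10: the only draw happened in the child task.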
So that might now be an issue with Test.guardseed, then. That copy seems to be supposed to work.
I'm having trouble setting the seed on worker tasks using Distributed. If I set the seed and then get a random number in one remote call, the result is correct. If I set the seed and then get the random number in two remote calls, the random number is different each time.
MWE
Output: