iamed2 opened 7 years ago
This would be a tricky situation to address if the RNG is being serialized/deserialized every time.
On 0.6, `global const rng = MersenneTwister(3742)` would ensure consistency with/without additional workers.
The behaviour we're aiming for in our use case is actually to have the RNG copied each time, but that's outside this issue. You can replace the RNG with any local mutable for the purposes of this issue.
Mutables are not synchronized across processes in any manner; there is no cluster-wide shared state for mutables referenced in closures. A common data store external to the Julia processes can address such a requirement, or you can keep the data on one of the processes and have workers do an atomic update-and-fetch from that data-store process.
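The data-store-process approach above can be sketched with a `RemoteChannel` (a minimal sketch; the `rng_store` and `shared_rand` names, and the use of a one-slot channel as the lock, are my own illustration, not from this issue):

```julia
using Distributed, Random  # MersenneTwister was in Base on 0.6; Random stdlib on 1.x

# Keep the RNG on process 1; all processes access it through a RemoteChannel.
const rng_store = RemoteChannel(() -> Channel{MersenneTwister}(1), 1)
put!(rng_store, MersenneTwister(3742))

# take!/put! gives an atomic update-and-fetch: only one process can hold the
# RNG at a time, so there is exactly one state regardless of nprocs().
function shared_rand()
    rng = take!(rng_store)   # exclusive access to the single shared copy
    x = rand(rng)            # mutate it
    put!(rng_store, rng)     # publish the updated state
    return x
end
```

The one-slot channel doubles as a mutex, so concurrent workers serialize their updates instead of mutating divergent copies.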
But does that mean that mutating mutables captured with `@spawn` has undefined behaviour, and might or might not mutate the same variable in other `@spawn` calls?
The docs currently say (emphasis mine):
> Note that although parallel for loops look like serial for loops, their behavior is dramatically different. In particular, the iterations do not happen in a specified order, and **writes to variables or arrays will not be globally visible since iterations run on different processes**. Any variables used inside the parallel loop will be copied and broadcast to each process.
This is the only mention of the behaviour of captured mutables.
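The copy-and-broadcast behaviour quoted from the docs can be observed directly (a sketch; `@parallel for` is the 0.6 spelling, `@distributed` its 1.x successor):

```julia
using Distributed
addprocs(2)  # at least one real worker, so the closure is actually serialized

a = zeros(Int, 10)
@sync @distributed for i in 1:10   # `@parallel for` on 0.6
    a[i] = i   # writes to a worker-local copy of `a`
end
a  # still all zeros on the master: the workers' writes were not globally visible
```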
I think one of these should happen:

- Mutables are copied for every `@spawn` call, even if the only worker is the host, or cached and shared among processes based on some trait such as:

  ```julia
  remote_move(::Type{MyType}) = Copy()
  remote_move(::Type{MyOtherType}) = Cache()
  ```

- The current behaviour is made well-defined, even if the only worker is the host, for `@spawn` and `pmap`, and documented in the `@spawn` and `pmap` docstrings as well as in the Parallel Computing section of the manual.

For performance reasons, remote calls executing locally short-circuit the entire serialization/deserialization cycle of the closure. I get your view about consistent results between `nprocs() == 1` and `nprocs() > 1` in absolutely all cases.
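A fuller sketch of how the proposed trait could drive the decision (the `RemoteMove` supertype and `should_cache` helper are my own illustration; `remote_move`, `Copy`, and `Cache` come from the proposal above, and none of this is an actual Julia API):

```julia
using Random  # for MersenneTwister in the example method

abstract type RemoteMove end
struct Copy <: RemoteMove end    # serialize a fresh copy on every remote call
struct Cache <: RemoteMove end   # send once per worker, reuse the cached copy

remote_move(::Type) = Copy()                   # conservative default
remote_move(::Type{MersenneTwister}) = Copy()  # explicit opt-in example
# remote_move(::Type{MyCacheType}) = Cache()   # hypothetical opt-in to caching

# A scheduler serializing a closure could consult the trait per captured value:
should_cache(x) = remote_move(typeof(x)) isa Cache
```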
The easier implementation to ensure similar behaviour in all cases would be to have remote calls to `myid()` go through a loopback connection, but at the cost of inefficiency in remote calls to the local process.
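The local short-circuit can be reproduced roughly as follows (a sketch; `@spawnat 1` forces execution on the master, and the `reference` snapshot is my own illustration):

```julia
using Distributed, Random

const rng = MersenneTwister(3742)
reference = copy(rng)            # snapshot of the state before the call

x = fetch(@spawnat 1 rand(rng))  # remote call that executes locally
# With nprocs() == 1 the closure is not serialized, so the *same* rng object
# is mutated and the local state has advanced past `reference`. With an extra
# worker (`@spawnat 2 ...`), rng would be copied and the local state untouched.
```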
There are three cases here to compare different behaviours with the different parallel methods, but I really just care about the `@fetch`/`@spawn` case. When there are one or more additional processes, `rng` is copied; when there is only one process, it is not.

The reason the `@parallel for` case fails both times is the use of `CachingPool`, which may also be used for `pmap` in the future via https://github.com/JuliaLang/julia/issues/21946.

Ideally I would like the behaviours to be consistent (though I don't need them to be deterministic).
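The `CachingPool` behaviour mentioned above can be sketched like this (my own illustration of the mechanism, not code from this issue):

```julia
using Distributed, Random

const rng = MersenneTwister(3742)
pool = CachingPool(workers())  # just [1] if no extra workers have been added

# CachingPool serializes the closure (and the rng it captures) to each worker
# once and then reuses that cached copy, so on a real worker the RNG state
# advances across calls instead of restarting from a fresh copy every time,
# which is yet another behaviour than plain @spawn with or without workers.
results = pmap(i -> rand(rng), pool, 1:4)
```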