Open quinnj opened 6 years ago
Hmm, this sounds like the RemoteRef is being serialized using a ClusterSerializer. See https://github.com/JuliaLang/julia/pull/22836
I would expect the RRID to be 0 and for operations on the deserialized RemoteRef to fail.
Here's how this functions on non-ClusterSerializer Serializers on 0.6:
```
julia [invenia]> serialize(io, Future())
ERROR: ArgumentError: elements of IntSet must be between 1 and typemax(Int)
Stacktrace:
 [1] _throw_intset_bounds_err() at ./intset.jl:64
 [2] push! at ./intset.jl:68 [inlined]
 [3] (::Base.Distributed.##133#134{Base.Distributed.RRID,Int64})() at ./distributed/remotecall.jl:249
 [4] lock(::Base.Distributed.##133#134{Base.Distributed.RRID,Int64}, ::Base.Threads.RecursiveTatasLock) at ./lock.jl:101
 [5] add_client at ./distributed/remotecall.jl:247 [inlined]
 [6] send_add_client(::Future, ::Int64) at ./distributed/remotecall.jl:262
 [7] serialize(::SerializationState{Base.AbstractIOBuffer{Array{UInt8,1}}}, ::Future, ::Bool) at ./distributed/remotecall.jl:281
 [8] serialize(::Base.AbstractIOBuffer{Array{UInt8,1}}, ::Future) at ./serialize.jl:630
```
and 0.7:
```
julia> using Serialization

julia> io = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> serialize(io, Future())
WARNING: Base.Future is deprecated: it has been moved to the standard library package `Distributed`.
Add `using Distributed` to your imports.
 in module Main

julia> seekstart(io)
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=54, maxsize=Inf, ptr=1, mark=-1)

julia> deserialize(io)
Distributed.Future(0, 0, 0, nothing)

julia> fetch(ans)
ERROR: no process with id 0 exists
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] worker_from_id(::Distributed.ProcessGroup, ::Int64) at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/cluster.jl:913
 [3] worker_from_id at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/cluster.jl:905 [inlined]
 [4] #remotecall_fetch#152 at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/remotecall.jl:392 [inlined]
 [5] remotecall_fetch at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/remotecall.jl:392 [inlined]
 [6] call_on_owner at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/remotecall.jl:465 [inlined]
 [7] fetch(::Distributed.Future) at /Users/ericdavies/repos/juliamaster/usr/share/julia/stdlib/v0.7/Distributed/src/remotecall.jl:497
 [8] top-level scope at none:0
```
Currently remote refs (`RemoteChannel`, `Future`) are a bit of a silent killer if used in a precompiled module. Though our precompilation docs spell out in great detail the various constructors/patterns that should be avoided, nothing is mentioned of remote refs.

Digging into the implementation, however, reveals that remote refs use a global counter for self-identifying. There are also likely to be issues with the worker id of the remote ref, but I didn't personally run into that (in my use-case, I was always creating `RemoteChannel`s on pid 1 as "config variables" that all other worker processes could reference).

The problem identifies itself by producing invalid results: e.g. the first `Future` on pid 1 created at runtime will probably match the exact `RRID` of a precompiled `RemoteChannel`, and hence `isready`, `wait`, and `fetch` return results of the precompiled `RemoteChannel` instead of the `Future`. Quite a lovely surprise!

Obviously we want to avoid this situation of things seeming completely broken, so if that involves throwing explicit errors when precompiling a module w/ global `RemoteChannel` or `Future` variables, that seems safest to me. I'm not aware of all the precompilation magic that is available, though, in case we could actually make this work.

Ultimately, I think something more than just docs is needed here.