JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License

[Distributed.jl] inconsistent serialization of closures over global vars #90

Open kleinschmidt opened 11 months ago

kleinschmidt commented 11 months ago

Here's my MWE (tested on Julia 1.9.2):

using Distributed, Serialization
y = 3
f = x -> x + y

worker = only(addprocs(1))
@everywhere worker using Serialization

fs = let
    io = IOBuffer()
    serialize(io, f)
    take!(io)
end;

# error: UndefVarError: `y` not defined
remotecall_fetch(worker, fs, 2) do fs, x
    f = deserialize(IOBuffer(fs))
    invokelatest(f, x)
end

# succeeds
remotecall_fetch(f, worker, 2)

# now succeeds
remotecall_fetch(worker, fs, 2) do fs, x
    f = deserialize(IOBuffer(fs))
    invokelatest(f, x)
end

I understand why the first invocation of my manually deserialized function doesn't work: y is a non-const global, so it is not captured by f. It works as I'd hoped if I define f inside a let block instead:

f = let
    y = 3
    x -> x + y
end

What's troubling me is that when you remotecall f itself, y somehow gets defined as a global on the worker, so the second time I deserialize and invoke f on the remote worker, it succeeds.

vtjnash commented 10 months ago

I am not quite sure what bug is being reported here. The remotecall_fetch code extends the serialization code to support moving global variables between compute nodes. That is not part of the standard serialization definition.

kleinschmidt commented 10 months ago

> The remotecall_fetch code extends the serialization code to support moving global variables between compute nodes. That is not part of the standard serialization definition

Yeah, once I dug more into the ClusterSerializer I saw that pretty quickly. At this point I think this is more of a documentation issue than anything else.
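
For readers who hit the same surprise, here is a minimal sketch of one way to make the manually deserialized closure work without relying on an earlier remotecall having sent the global. It assumes the setup from the MWE above and defines y on the worker explicitly with @everywhere before deserializing. The names bytes, g, and result are illustrative only, and this is a workaround sketch rather than anything the Distributed or Serialization documentation prescribes.

```julia
using Distributed, Serialization

y = 3
f = x -> x + y          # refers to the non-const global `y`; the value is not captured

worker = only(addprocs(1))
@everywhere worker using Serialization

# Plain `serialize` writes the closure itself, but not the value of the global `y`.
bytes = let
    io = IOBuffer()
    serialize(io, f)
    take!(io)
end

# Assumption: defining `y` in Main on the worker is enough for the
# deserialized closure to resolve it there.
@everywhere worker y = $y

result = remotecall_fetch(worker, bytes, 2) do bytes, x
    g = deserialize(IOBuffer(bytes))
    Base.invokelatest(g, x)
end

rmprocs(worker)
```

If the closure should carry its own copy of the value instead, the let-block form shown earlier in the thread avoids the dependency on worker-side globals altogether.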