JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License

Closures are available automatically on parallel workers, but normal functions are not #37

Open mbeltagy opened 7 years ago

mbeltagy commented 7 years ago

On Julia 0.5, I encountered the following strange behavior.

addprocs(4)
function adder_gen(n)
    function add(x)
        x+n
    end
end
f_closure=adder_gen(5)
fetch(@spawn f_closure(5))

Normally, I would expect the last line not to work, since no @everywhere preceded the closure generation. If this were done with a normal function, an error would be raised. I believe an explanation is in order, as nothing in the manual suggests this behavior.

krcools commented 7 years ago

Thank you! I spent a couple of days this week building wonky code under the assumption that I could not pass a closure (SO: Evaluation context of @everywhere).

tpapastylianou commented 7 years ago

I'm not sure (as you also point out) whether this is formal behaviour or "bug as feature", but the issue seems to be that lambdas get passed and evaluated seamlessly on the worker, whereas normal functions do not and need to be defined on the worker explicitly, which is the expected behaviour. The closure works because it's actually a lambda; you can confirm that from its signature. For example:

julia> addprocs(1)             #=> 1-element Array{Int64,1}:  2  
julia> g = () -> myid()        #=> (::#1) (generic function with 1 method)  (note the :: )
julia> f() = myid()            #=> f (generic function with 1 method)  
julia> remotecall_fetch(g, 2)  #=> 2
julia> remotecall_fetch(f, 2)  #=> ERROR: On worker 2: UndefVarError: #f not defined

Clearly if this is undocumented it may change in a later release, but it would actually be a nice feature to have as a workaround. Then again, perhaps the fact that you can pass a closure and evaluate it locally on a worker shouldn't be so surprising, since in theory they are simple objects that happen to be callable, so all you're doing is sending over an object and then performing a simple operation on it (as opposed to a normal named function, where you're just passing the symbol and asking for the evaluation of that symbol to occur on the worker, assuming it is defined locally).
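As a rough illustration of the "object that happens to be callable" view, here is a local sketch reusing adder_gen from the original post. It runs on a single process and only shows that the closure carries its captured n as a field; it makes no claim about how Distributed actually serializes the closure. The names captured and result are just for illustration.

```julia
# Local sketch only: no worker processes are involved here.
function adder_gen(n)
    add(x) = x + n   # `add` captures `n` from the enclosing scope
    return add
end

f_closure = adder_gen(5)

# The closure's type stores the captured variable as a field,
# so the value 5 is part of the object itself.
captured = fieldnames(typeof(f_closure))

# Calling the object uses the stored field.
result = f_closure(5)
```

Inspecting f_closure.n gives back the captured value, which fits the idea that sending the closure means sending an object together with its data.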

So I agree, it would be nice to know if this is an intentional feature that can be used without fear of deprecation, and it would be useful if a dev could confirm this is the case and request its addition to the manual in the parallel computing section, as this is indeed a very useful mechanism to have in mind.

EDIT: looking at the documentation for @spawn clears this up: passing a closure is definitely a feature of the language.

help?> @spawn
  @spawn

  Creates a closure around an expression and runs it on an automatically-chosen process,
  returning a Future to the result.

So perhaps the devs could simply add this information to the documentation for clarity.

amitmurthy commented 7 years ago

In the above example, the issue is https://github.com/JuliaLang/julia/issues/19000

The following will work, as both the closure and the named function are treated as local bindings.

let
    g = () -> myid() 
    f() = myid() 
    remotecall_fetch(g, 2) 
    remotecall_fetch(f, 2) 
end

While it ought to be documented as a "gotcha" for now, I think it needs a more general approach that addresses this as part of https://github.com/JuliaLang/julia/issues/11228