JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License

Order dependent module loading with `Distributed` #56

Open vchuravy opened 5 years ago

vchuravy commented 5 years ago

Discovered by @ararslan in his quest to get Nanosoldier back online.

In distributed mode, `using X` should make X available on the worker nodes as a root module, so that `remotecall(X.f)` works without an `@everywhere using X`.

This works correctly:

julia> using Distributed

julia> addprocs(2)
2-element Array{Int64,1}:
 2
 3

julia> @everywhere begin
       function log_require(mod::Base.PkgId)
         @info "Loading pkg $mod $(Base.root_module_exists(mod)) on proc $(myid())"
       end
       push!(Base.package_callbacks, log_require)
       end

julia> using BenchmarkTools
[ Info: Loading pkg JSON [682c06a0-de6a-54ab-a142-c8b1cf79cde6] true on proc 1
[ Info: Loading pkg JSON [682c06a0-de6a-54ab-a142-c8b1cf79cde6] true on proc 2
[ Info: Loading pkg JSON [682c06a0-de6a-54ab-a142-c8b1cf79cde6] true on proc 3
[ Info: Loading pkg BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf] true on proc 2
[ Info: Loading pkg BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf] true on proc 3
[ Info: Loading pkg BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf] true on proc 1
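
For illustration, this ordering is what allows a module-qualified function to be called remotely with no explicit load on the workers. A minimal sketch, assuming the stdlib `Statistics` stands in for X (it is not part of the original report) and using the first worker returned by `addprocs`:

```julia
using Distributed

pids = addprocs(2)          # workers exist before the package is loaded

using Statistics            # stand-in for X; not from the original report

# No `@everywhere using Statistics` here: the function is passed as a
# module-qualified value and resolved on the worker.
result = remotecall_fetch(Statistics.mean, first(pids), [1, 2, 3])
```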

If the user runs `using X` before `addprocs`, a subsequent `using X` no longer triggers the callback, and we run into https://github.com/JuliaLang/julia/pull/28857#issuecomment-415590074

julia> using Distributed

julia> using BenchmarkTools

julia> addprocs(2)
2-element Array{Int64,1}:
 2
 3

julia> @everywhere begin
       function log_require(mod::Base.PkgId)
         @info "Loading pkg $mod $(Base.root_module_exists(mod)) on proc $(myid())"
       end
       push!(Base.package_callbacks, log_require)
       end

julia> using BenchmarkTools

julia> @everywhere using BenchmarkTools
[ Info: Loading pkg JSON [682c06a0-de6a-54ab-a142-c8b1cf79cde6] true on proc 2
[ Info: Loading pkg JSON [682c06a0-de6a-54ab-a142-c8b1cf79cde6] true on proc 3
[ Info: Loading pkg BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf] true on proc 2
[ Info: Loading pkg BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf] true on proc 3
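
Until the ordering issue is resolved, the explicit `@everywhere using X` seen in the transcript can serve as a workaround. A minimal sketch, again assuming the stdlib `Statistics` as a stand-in for X (an assumption, not part of the report):

```julia
using Distributed
using Statistics            # loaded before any workers exist

pids = addprocs(2)

# Explicitly load the package on every process, as in the transcript above.
@everywhere using Statistics

result = remotecall_fetch(Statistics.mean, first(pids), [1, 2, 3])
```

This does not depend on the package callback firing, so it should behave the same whichever order `using` and `addprocs` ran in.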

affans commented 5 years ago

Is this related to https://github.com/JuliaLang/julia/issues/28781? It would be nice if it could fix that bug as well.

vchuravy commented 5 years ago

No, that is orthogonal to this issue.

timholy commented 5 years ago

I've sometimes wanted this. OTOH, a manual `@everywhere using X` specifies that all workers use X. But what if you don't want all workers knowing about X? For example, if I have n computational workers and 1 visualization worker, I don't really need/want to load Gtk.jl on all the workers. (This might be particularly relevant if the workers are headless and `using Gtk` throws an error without a display environment. I don't know if that's true, but just supposing.)
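
One way the selective case could be expressed today is with `Distributed.remotecall_eval` on a chosen subset of workers. A rough sketch, assuming the stdlib `Statistics` stands in for Gtk.jl and that the `viz`/`compute` roles are purely hypothetical names:

```julia
using Distributed

pids = addprocs(2)
viz, compute = pids[1], pids[2]   # hypothetical roles, for illustration only

# Evaluate `using Statistics` in Main on the chosen worker only.
Distributed.remotecall_eval(Main, [viz], :(using Statistics))

# Check which workers have the name bound in Main.
has_binding(p) = remotecall_fetch(() -> isdefined(Main, :Statistics), p)
on_viz = has_binding(viz)
on_compute = has_binding(compute)
```

`@everywhere` also accepts a process list, but it additionally runs `using` and `import` statements on the calling process, which may in turn trigger the automatic loading discussed in this issue; `remotecall_eval` avoids that step.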