JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.93k stars 5.49k forks source link

Deprecate `myid` and `nprocs` fully #25135

Closed vchuravy closed 6 years ago

vchuravy commented 6 years ago

After a brief discussion with @vtjnash it seems that the remaining uses of myid and nprocs can be factored out and these two function can move fully to Distributed.

amitmurthy commented 6 years ago

The reason I left myid and nprocs in Base is to help library writers detect if their code is running in distributed mode or not.

if nprocs() > 1
  # distributed mode - pmap, `@parallel for` are all available
  if myid() == 1
    foo_master_in_distmode()
  else
    foo_worker_in_dist_mode()
  end
else
  foo_non_dist_mode_do_something_else()
end 
vchuravy commented 6 years ago

We can relatively easily provide a distributed_mode() function or variable. https://github.com/JuliaLang/julia/blob/fa801bb2c83114228f69cd9db09c5cfafebca999/base/client.jl#L289 is essentially that check.

Alternatively that check correctly reads right something like Base.isbindingresolved(Main, :Distributed) && !(Base.isdeprecated(Main, :Distributed)) && invokelatest(getfield, Main, :Distributed) === Base.root_module(:Distributed) which is why I abandoned the approach in #25119 and every call to Base.myid needs to do that... (we could cache that lookup, but still...)

amitmurthy commented 6 years ago

distributed_mode() as implemented will not work for distributed mode started via addprocs

using Distributed
addprocs()
using FooLibrary
FooLibrary.foo()   # `foo` needs to know if it us running in distributed mode or not.

nprocs() is the correct check for it.

vchuravy commented 6 years ago

You are correct that distributed_mode() currently disregards dynamic addprocs().

Let me make sure that I understand your concern correctly, FooLibrary needs to know that it is running in a distributed mode, but can't have a import Distributed: nprocs? If the library is already aware of distributed programming I would expect that it does at some point do a using Distributed. Is there are scenario where that wouldn't be true?

vtjnash commented 6 years ago

@amitmurthy that's what the Base.package_callbacks are for (so you can set flags for when Distributed is loaded).

amitmurthy commented 6 years ago

@vtjnash Distributed is currently "loaded" by default in sysimg.jl (no bindings are put in Main though), so I am not sure if Base.package_callbacks is applicable here.

@vchuravy this is more future proofing at this point with the assumption that one of the drivers for isolating the distributed functionality in a module and moving into stdlib is to make it easy to support multiple "distributed" implementations. In this scenario, the library will not execute a using Distributed, but will expect the main driver script to setup a cluster and load an appropriate distributed functionality. The library will call a pmap if running in dist mode, which may be Distributed.pmap or FooDistributed.pmap (with the same function signature) as the case may be. And yes, FooDistributed will have to replace Base.myid and Base.nprocs with its own implementation.

vchuravy commented 6 years ago

But if the library is calling pmap or any other functionality from Distributed or FooDistributed it will have to import either package since pmap won't be available otherwise. So wither the library is oblivious of distributed computing and we can use dispatch on types to use a parallel map (like DistributedArrays) or it has to be aware which Distributed implementation it is going to be using.

My main goal in moving Distributed to stdlib was to no longer tie development efforts to Julia versions.

juliohm commented 6 years ago

Could you please give some rationale for why nprocs is being removed from Base? How this affects package writers? In the case it is indeed removed, could you please provide an alternative to the pattern @amitmurthy described? I really on it heavily, for example here: https://github.com/juliohm/GeoStats.jl/blob/master/src/solvers.jl#L30-L37

From a user's perspective, this is how I understood the mechanics of distributed parallelism: http://nbviewer.jupyter.org/github/juliohm/GeoStats.jl/blob/master/examples/ParallelSimulation.ipynb

Is it correct? Will it change?

vchuravy commented 6 years ago

All of the Distributed processing tools have been moved to a stdlib package. So the only thing that will change is that at some-point in GeoStats.jl you will need to say using Distributed. The only thing this issue is about is whether to also move myid and nprocs to Distributed as well, everything else has already moved.

vchuravy commented 6 years ago

@amitmurthy We could make myid and nprocs fully hookable, but that leads to fairly inefficient code.

myid_func = Ref{Any}(()->1)
myid() = invokelatest(myid_func[])::Int

Then Distributed can later on say: myid_func[] = Distributed.myid, but I really would advise against the use of invokelatest if we can avoid it.

In my books a using Distributed or import Distributed: myid, nprocs is not to much to ask of package writers.

amitmurthy commented 6 years ago

Right. I was under the impression that bindings created under Main are visible to loaded packages - this was the root of my misunderstanding.

No objections to moving myid and nprocs out of Base now.

JeffBezanson commented 6 years ago

Are there cases where you need to check nprocs even though you don't use other distributed functionality? It would be bad to require adding a dependency on Distributed in those cases if so.

juliohm commented 6 years ago

If Distributed is a large dependency, I agree with @JeffBezanson , it would be great to be able to check if nprocs() > 1 or nworkers() > 1 as a package writer without a heavy dependency. If it is a light dependency, which will stay light forever, I don't mind it at all.

StefanKarpinski commented 6 years ago

Why not just have an isdistributed() predicate in Base similar to isinteractive()? Checking the number of processes or workers is not really the question we want the answer to.

juliohm commented 6 years ago

Maybe isdistributed() and ismultiprocess()? As @vchuravy explained to me the other day, my use cases are usually with nworkers(): https://discourse.julialang.org/t/clarification-on-pmap-master-worker-model/7824

StefanKarpinski commented 6 years ago

We don't need a predicate for every yes-no question you might ever ask. The reason for isdistributed is to know if it's safe to do nworkers() > 1 or not.

amitmurthy commented 6 years ago

@StefanKarpinski isdistributed is effectively nprocs() > 1.

To answer Jeff's question, I can imaging a scenario where a library may opt to support different types of parallelism.

module Foo

if nprocs() > 1  # dist mode
  using DistributedArrays   # this in turn loads Distributed
elseif is_gpu_available()  # GPU parallelism
  using GPUArrays
else
  threading=true
end

function foo()
  if nprocs() > 1
     ....        # Use DArrays
  elseif is_gpu_available()
    ....   # Use GPUArrays
  else
    ....    # Use multithreading
  end
end

end