JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License

`pmap` still parallelizing after `rmprocs`? #72

Closed (pazner closed 5 months ago)

pazner commented 3 years ago

The first call to pmap after using rmprocs to remove all the workers still behaves as if it is running in parallel. The second call behaves normally:

julia> using Distributed

julia> addprocs(5);

julia> @time @sync pmap(i -> sleep(1), 1:10);
  2.711441 seconds (251.94 k allocations: 13.460 MiB)

julia> wait(rmprocs(workers()...))

julia> @time @sync pmap(i -> sleep(1), 1:10);
  2.111553 seconds (192.76 k allocations: 9.978 MiB)

julia> @time @sync pmap(i -> sleep(1), 1:10);
 10.135405 seconds (126.43 k allocations: 6.459 MiB, 0.12% gc time)

Version info:

Julia Version 1.5.0
Commit 96786e22cc (2020-08-01 23:44 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, haswell)

felixcremer commented 1 year ago

I think this is because default_worker_pool is not emptied by the rmprocs call. If we do a take! after the rmprocs, the default_worker_pool is empty and the pmap call runs sequentially, as expected. Unfortunately, I don't know what happens inside this call to take!.

julia> addprocs(5);

julia> @time @sync pmap(i -> sleep(1), 1:10);
  3.307411 seconds (115.06 k allocations: 5.953 MiB, 4.43% compilation time)

julia> default_worker_pool()
WorkerPool(Channel{Int64}(9223372036854775807), Set([35, 32, 36, 33, 34]), RemoteChannel{Channel{Any}}(1, 1, 28))

julia> wait(rmprocs(workers()...))

julia> default_worker_pool()
WorkerPool(Channel{Int64}(9223372036854775807), Set([35, 32, 36, 33, 34]), RemoteChannel{Channel{Any}}(1, 1, 28))

julia> take!(default_worker_pool())
1

julia> default_worker_pool()
WorkerPool(Channel{Int64}(9223372036854775807), Set{Int64}(), RemoteChannel{Channel{Any}}(1, 1, 28))

julia> @time @sync pmap(i -> sleep(1), 1:10);
 10.144675 seconds (72.51 k allocations: 3.716 MiB, 1.19% compilation time)
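
The workaround above could be wrapped up roughly as follows. This is only a sketch based on the behavior observed in this session: remove_all_workers! is a hypothetical helper name, not part of Distributed, and it assumes that take! on the default pool drops the stale worker ids and returns 1 once no workers are left.

```julia
using Distributed

# Hypothetical helper, not part of Distributed: remove every worker, then
# take one id from the default pool. In the session above, that take!
# call left the pool's worker set empty.
function remove_all_workers!()
    # workers() returns [1] when there are no workers; never pass pid 1 to rmprocs.
    pids = filter(p -> p != 1, workers())
    isempty(pids) || wait(rmprocs(pids...))
    # Assumption: with no live workers left, take! on the default pool
    # returns 1 (the master process), as seen in the session above.
    return take!(default_worker_pool())
end
```

After addprocs(5), calling remove_all_workers!() before the next pmap should leave default_worker_pool() with an empty worker set, so that pmap runs on the master process only.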

vtjnash commented 5 months ago

This seems to be working as expected now:

julia> @time @sync pmap(i -> sleep(1), 1:10);
  4.854871 seconds (2.19 M allocations: 110.307 MiB, 0.41% gc time, 26.33% compilation time)

julia> wait(rmprocs(workers()...))

julia> @time @sync pmap(i -> sleep(1), 1:10);
 10.106080 seconds (81.12 k allocations: 4.103 MiB, 0.83% compilation time)

julia> @time @sync pmap(i -> sleep(1), 1:10);
 10.106958 seconds (81.12 k allocations: 4.095 MiB, 0.83% compilation time)

This works even though default_worker_pool() still has out-of-date info before take! is called.