GregPlowman opened this issue 6 years ago
See the discussion in https://github.com/JuliaLang/julia/issues/19578. Right now our `pmapreduce` is hidden as `@parallel (op) for`.
Thanks for that reference, Andreas.

The problem for me with `@parallel (op) for` is that it statically partitions the iterations across workers, whereas I want the dynamic load balancing of `pmap`. And I want `pmapreduce` rather than `pmap`; otherwise memory blows out, since all the results get collected before reducing. Hopefully a new `pmapreduce` will distribute work dynamically and minimize memory usage.

I guess for now I'll try to use my own v0.5 `pmapreduce` with v0.6. Let's see if it's compatible.
The following should do what you need (no batching and retrying though):

```julia
# Fold each mapped result into `v` as it arrives. AsyncGenerator keeps
# up to `ntasks` remote calls in flight, so the mapped results are
# never all materialized at once.
v = initial_val
for x in Base.AsyncGenerator(y -> remotecall_fetch(foo_map_func, default_worker_pool(), y),
                             collection_to_be_mapreduced; ntasks = nworkers())
    v = foo_reduce_func(v, x)
end
```
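For what it's worth, a concrete instantiation of that pattern might look like the sketch below (the function name `expensive` and the workload are illustrative, not from this thread; on Julia ≥ 0.7 these functions live in the `Distributed` stdlib):

```julia
using Distributed
addprocs(4)
@everywhere expensive(x) = x^2   # stand-in for a costly map function

# Parallel sum of expensive(x): results are folded into `total` as
# they arrive, so the full result vector is never collected.
total = 0
for x in Base.AsyncGenerator(y -> remotecall_fetch(expensive, default_worker_pool(), y),
                             1:10_000; ntasks = nworkers())
    global total += x
end
```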
Unexported `AsyncGenerator` and `pgenerate` should be used as the basis for a `pmap_reduce` when implemented. Note that `pgenerate` appears to be currently broken.
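If `pgenerate` does get fixed, a minimal sketch of a `pmap_reduce` built on it could be quite short (hedged: `Distributed.pgenerate` is unexported and subject to change, and the function name and signature here are just an illustration):

```julia
using Distributed

# Sketch only. pgenerate maps `mapf` over `itr` across the default
# worker pool and yields results incrementally (in input order), so
# foldl buffers at most a handful of in-flight results instead of
# materializing the whole mapped collection.
function pmap_reduce(mapf, reducef, v0, itr)
    gen = Distributed.pgenerate(default_worker_pool(), mapf, itr)
    return foldl(reducef, gen; init = v0)
end

pmap_reduce(x -> x^2, +, 0, 1:1000)   # e.g. parallel sum of squares
```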
Are there any plans to add a `pmapreduce` or a load-balancing variant of `@distributed` to the stdlib?
I have need for a `pmap_reduce` function. In Julia v0.5 and earlier, I was able to customize the base `pmap` to provide what I needed. However, since v0.6, parallel processing, including `pmap`, has been overhauled and largely rewritten. To a non-expert like me, it is much more complex and impenetrable compared to the simpler implementation of earlier versions, which is still given as an example in the Parallel Computing section of the manual: https://docs.julialang.org/en/stable/manual/parallel-computing/#Scheduling-1

Are there any issues with continuing, in v0.6 and later, to model my own `pmap_reduce` on the v0.5 `pmap`? What are the benefits of the new implementation of `pmap`?

Alternatively, would you consider a feature request for an official `pmap_reduce`?
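For reference, a minimal sketch of such a `pmap_reduce`, modeled on the dynamic-scheduling `pmap` example from that manual section, might look like this (assumptions: Julia ≥ 0.7 with the `Distributed` stdlib; `op` associative and commutative, since results arrive in nondeterministic order; the names are illustrative, not an official API):

```julia
using Distributed

function pmap_reduce(f, op, v0, lst)
    n = length(lst)
    i = 1
    # Hand out indices one at a time for dynamic load balancing.
    # No lock is needed: @async tasks on one process are scheduled
    # cooperatively, so nextidx() runs without interleaving.
    nextidx() = (idx = i; i += 1; idx)
    acc = v0
    @sync for p in workers()
        @async while true
            idx = nextidx()
            idx > n && break
            r = remotecall_fetch(f, p, lst[idx])
            acc = op(acc, r)   # fold in immediately; no result array
        end
    end
    return acc
end

pmap_reduce(x -> x^2, +, 0, 1:1000)   # e.g. parallel sum of squares
```

Compared with collecting via `pmap` and reducing afterwards, this keeps only the accumulator and the in-flight results in memory, which is the point of wanting a `pmap_reduce` in the first place.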