Open raminammour opened 5 years ago
@andreasnoack Any thoughts on this?
It's a good observation, and the fix is pretty simple, though not super pretty.
I'm wondering whether, with the new multithreading, we can now just delegate all the scheduling to a separate task that won't block while the local work is being executed. I'd like to hear @vchuravy's thoughts.
Looking at the code, the pattern

    @sync for i in pids
        @async remotecall_fetch(**do_work**, i, ...)
    end

is common (and natural), so this may happen anywhere **do_work** is heavy. I guess adding `yield()` in the correct places would work...
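To make the blocking behaviour concrete, here is a minimal sketch that uses plain tasks on a single process instead of real `remotecall_fetch` calls. The function `simulate` and the `:local` / `:remote` labels are hypothetical; `sleep` stands in for waiting on a remote reply (which yields), and the `sum` stands in for heavy local work that never yields.

```julia
# Sketch only: plain tasks stand in for the `remotecall_fetch` calls.
# `simulate` and the :local / :remote labels are made up for illustration.
function simulate(order)
    events = String[]
    @sync for kind in order
        @async begin
            if kind == :local
                push!(events, "local start")
                sum(abs2, 1:10^6)        # heavy work that never yields
                push!(events, "local done")
            else
                push!(events, "remote sent")
                sleep(0.01)              # waiting for a reply yields to other tasks
                push!(events, "remote done")
            end
        end
    end
    return events
end

events_local_first = simulate([:local, :remote])
events_local_last  = simulate([:remote, :local])
```

With the local task first, the "remote" request should only be recorded after the local work has finished; with the local task last, the request goes out before the local work starts.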
Or, at construction of `DArray`, by convention, have the `id==myid()` be last and preserve the invariant that `pid[i]` holds chunk `i`.
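A rough sketch of the "local id last" ordering, assuming it is applied to the pid list before the chunks are assigned so that the `pid[i]` / chunk `i` correspondence is kept. The helper name `move_local_last` is hypothetical and not part of DistributedArrays.jl; in real use the second argument would be `myid()`.

```julia
# Hypothetical helper: move the local process id `me` to the end of the pid
# list, keeping the relative order of all other ids unchanged.
function move_local_last(pids::AbstractVector{<:Integer}, me::Integer)
    others = filter(!=(me), pids)
    # Only append `me` if it was part of the list to begin with.
    return me in pids ? vcat(others, me) : others
end

ordered = move_local_last([1, 2, 3, 4], 2)
```

Chunks would then have to be distributed according to the reordered list, otherwise the invariant above no longer holds.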
Cheers!
I think we need to carefully go through Distributed.jl and look at whether we can start using `@spawn` instead of `@async`, and then do the same for DistributedArrays.jl.
Won't be easy since a whole bunch of this code is based on cooperative tasking, and switching to parallelism will expose races.
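As an illustration of the kind of race meant here (a sketch, assuming `@spawn` refers to `Threads.@spawn`; the function names are hypothetical): an unsynchronized update of shared state is safe under `@async`, because the tasks share one thread and only switch at yield points, but the same update needs an atomic or a lock once tasks can run in parallel.

```julia
using Base.Threads: @spawn, Atomic, atomic_add!

# Cooperative tasks: `counter[] += 1` contains no yield point, so the
# increments cannot interleave and no synchronization is needed.
function count_async(n)
    counter = Ref(0)
    @sync for _ in 1:n
        @async counter[] += 1
    end
    return counter[]
end

# Parallel tasks: the same read-modify-write on a plain `Ref` would be a data
# race, so this version uses an atomic counter instead.
function count_spawn_atomic(n)
    counter = Atomic{Int}(0)
    @sync for _ in 1:n
        @spawn atomic_add!(counter, 1)
    end
    return counter[]
end

total_async = count_async(1000)
total_spawn = count_spawn_atomic(1000)
```

Code that relies on the first pattern is what would need auditing before swapping `@async` for `@spawn`.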
I might be able to have a UROP look at this transition.
Fixes issue #206; please see the issue description for an explanation of the fix.