Closed by jwiegley 1 year ago
This issue is split off from #2.
Raising an exception when calling `wait` from a running job seems like a pessimal solution, since it makes it difficult to run a set of jobs when the dependency graph is not known ahead of time. Only raising when all slots are taken is even worse, since it means the error could happen nondeterministically.
My expectation is that any finite poset of jobs should eventually complete. This implies that if a worker runs a job that calls `wait` on another job in the same group, the worker should execute that job.
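A minimal sketch of that "helping" behaviour, using only `base`. The names `Job`, `newJob`, `runIfPending` and `helpingWait` are hypothetical and not part of the library; the sketch ignores exceptions and any thread or capability affinity a job might have.

```haskell
import Control.Concurrent.MVar

-- Hypothetical job handle: the action stays in 'pending' until someone claims it.
data Job a = Job { pending :: MVar (Maybe (IO a)), result :: MVar a }

newJob :: IO a -> IO (Job a)
newJob act = Job <$> newMVar (Just act) <*> newEmptyMVar

-- Claim the job if nobody has yet, and run it in the calling thread.
-- A pool worker would call this on jobs it takes from the queue.
runIfPending :: Job a -> IO ()
runIfPending job = do
  claimed <- modifyMVar (pending job) (\m -> return (Nothing, m))
  case claimed of
    Just act -> act >>= putMVar (result job)
    Nothing  -> return ()

-- Waiting "helps": run the job inline if it is unclaimed,
-- otherwise block until whoever claimed it stores the result.
helpingWait :: Job a -> IO a
helpingWait job = runIfPending job >> readMVar (result job)

main :: IO ()
main = do
  inner <- newJob (return (20 :: Int))
  outer <- newJob (fmap (+ 1) (helpingWait inner))
  helpingWait outer >>= print   -- prints 21
```

Because the waiter runs an unclaimed job itself, a job that depends on a not-yet-started job never needs a second slot, so a finite dependency order always makes progress in this model.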
@depp, unfortunately that's not always correct, as the waited-on `Async` may be bound to run on a particular capability, one which is different from the waiter's capability.
~The~ A correct solution would probably be more along the lines of temporarily increasing the number of concurrent tasks (or equivalently decreasing the number of running tasks) while the wait is in progress. This highlights a distinction not currently made in the library between
Because of this new distinction, any solution would require some consideration.
This would require us to somehow figure out whether we're being run from one of the tasks; otherwise it would spawn one more thread than the caller requested.
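The idea of temporarily giving up a slot while a wait is in progress could look roughly like the sketch below. It uses only `base`; `Pool`, `submit` and `waitInJob` are hypothetical names, not the library's API, and a job that throws would leave its result `MVar` empty.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Concurrent.QSem
import Control.Exception (bracket_)

-- Hypothetical pool: a semaphore counting free execution slots.
newtype Pool = Pool QSem

-- Run a job in its own thread once a slot is free; the MVar receives the result.
submit :: Pool -> IO a -> IO (MVar a)
submit (Pool slots) job = do
  result <- newEmptyMVar
  _ <- forkIO $ bracket_ (waitQSem slots) (signalQSem slots) (job >>= putMVar result)
  return result

-- Wait from inside a running job: hand the slot back while blocked,
-- then reacquire it before the job continues.
waitInJob :: Pool -> MVar a -> IO a
waitInJob (Pool slots) result =
  bracket_ (signalQSem slots) (waitQSem slots) (readMVar result)

main :: IO ()
main = do
  pool <- Pool <$> newQSem 1            -- a single slot
  outer <- submit pool $ do
    inner <- submit pool (return (21 :: Int))
    x <- waitInJob pool inner           -- a plain readMVar here would deadlock
    return (x * 2)
  readMVar outer >>= print              -- prints 42
```

The sketch assumes `waitInJob` is only ever called from inside a job. Called from anywhere else, it would briefly admit one more job than the configured limit, which is exactly why the caller's context has to be known.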
thanks, @l29ah
Pinging @spl
If `mapTasks` invokes `mapTasks`, deadlock can occur. Scenario:

1. The outer `mapTasks`, which enqueues all of its jobs immediately in the execution graph, starts executing the first N tasks. It also waits until all tasks are finished before returning to the caller.
2. If one of those tasks calls `mapTasks`, it will enqueue M more tasks, and wait for them to complete as well. However, the inner `mapTasks` can starve because no execution slots may ever free up: all jobs are now blocked, waiting on future jobs, which are waiting on the blocked jobs to finish.

Solution: If we notice a call to `wait` in a thread whose `threadId` matches that of a running job, raise an exception that there may be a deadlock scenario.

We could either raise this exception always (which makes it easier for the developer to avoid this situation), or we could only raise it if, when `wait` is called, there are no available slots.
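The "always raise" variant of this check might be sketched as follows, using only `base`. The registry `Running`, the wrapper `asJob`, the function `checkedWait` and the exception `WaitInsideJob` are all hypothetical names introduced for illustration, not the library's implementation.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Concurrent.MVar
import Control.Exception
import Data.IORef
import Data.List (delete)

data WaitInsideJob = WaitInsideJob deriving Show
instance Exception WaitInsideJob

-- Hypothetical registry of threads that are currently running a job.
type Running = IORef [ThreadId]

-- Mark the current thread as running a job for the duration of the action.
asJob :: Running -> IO a -> IO a
asJob running job = do
  tid <- myThreadId
  bracket_ (atomicModifyIORef' running (\ts -> (tid : ts, ())))
           (atomicModifyIORef' running (\ts -> (delete tid ts, ())))
           job

-- Refuse to block if the calling thread is itself a running job.
checkedWait :: Running -> MVar a -> IO a
checkedWait running result = do
  tid <- myThreadId
  ts  <- readIORef running
  if tid `elem` ts then throwIO WaitInsideJob else readMVar result

main :: IO ()
main = do
  running <- newIORef []
  done    <- newMVar "finished"
  checkedWait running done >>= putStrLn                 -- outside a job: allowed
  r <- try (asJob running (checkedWait running done))   -- inside a job: rejected
  putStrLn (either (\WaitInsideJob -> "refused: wait inside a running job") id r)
```

Checking unconditionally keeps the failure deterministic. The alternative of raising only when no slots are free would need an extra test against the pool's free-slot count at the point of the check, and its outcome would then depend on scheduling.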