Summary of changes:
- Give additional time for the initial lease when jobs are dispatched early
- Simplify the logic that assigns ports for distributed jobs
- Add a barrier to ensure jobs in round `r+1` cannot proceed with `_done_callback` until round `r` has completed
- Remove `self._current_dispatched_jobs` and `self._next_dispatched_jobs`, as these are redundant with `self._current_worker_assignments` and `self._next_worker_assignments` respectively
- Add try/except blocks around `self._update_lease_callback` and `self._construct_command`
- Add a try/except block around cancelling the completion event for jobs with extended leases
- Fix an error in the dispatcher with jobs that do not require a data directory
- Remove steps files in the dispatcher after a job completes
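
For reviewers who want a picture of the round barrier, here is a minimal sketch of the idea, not the code in this change. It assumes a condition variable guarding a "last completed round" counter; `RoundBarrier`, `mark_round_complete`, `wait_for_previous_round`, and the standalone `done_callback` are hypothetical names used only for illustration.

```python
import threading


class RoundBarrier:
    """Blocks work that belongs to round r+1 until round r is marked complete.

    Hypothetical helper for illustration; not the scheduler's actual class.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._last_completed_round = -1  # no round has completed yet

    def mark_round_complete(self, round_id):
        # Record the completed round and wake every waiting callback.
        with self._cond:
            self._last_completed_round = max(self._last_completed_round, round_id)
            self._cond.notify_all()

    def wait_for_previous_round(self, round_id, timeout=None):
        # Returns True once round_id - 1 has completed, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._last_completed_round >= round_id - 1,
                timeout=timeout,
            )


def done_callback(barrier, job_id, round_id, log):
    # Stand-in for the scheduler's _done_callback (assumed shape):
    # a job from round r+1 waits here until round r has finished.
    barrier.wait_for_previous_round(round_id)
    log.append((job_id, round_id))
```

With this shape, a callback for a job in round 1 blocks until `mark_round_complete(0)` is called, while callbacks for round 0 pass through immediately.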