Open etam opened 4 years ago
For Task API tasks, `TaskComputerAdapter._handle_computation_results` calls `send_results` or `send_task_failed` on `self._task_server`, then calls `self._finished_cb()`, which is `Node._try_shutdown`. The problem is that these `TaskServer` methods send messages (and, in the case of `send_results`, also resources) asynchronously, so the callback to `Node._try_shutdown` interrupts those asynchronous operations before they complete.
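
A minimal sketch of how this kind of race arises, using `asyncio` rather than the project's actual event loop. `FakeTaskServer`, `run`, and `demo` are hypothetical stand-ins invented for illustration; they are not the real `TaskServer` or `Node` code. The shutdown step is modelled as cancelling whatever sends are still pending.

```python
import asyncio


class FakeTaskServer:
    """Hypothetical stand-in for TaskServer; not the project's real class."""

    def __init__(self):
        self.delivered = []
        self.pending = set()

    def send_results(self, task_id):
        # Schedules the send and returns immediately, like an asynchronous API.
        task = asyncio.ensure_future(self._send(task_id))
        self.pending.add(task)
        task.add_done_callback(self.pending.discard)

    async def _send(self, task_id):
        await asyncio.sleep(0.01)  # simulated network latency (assumed value)
        self.delivered.append(task_id)


async def run(wait_for_sends):
    server = FakeTaskServer()
    server.send_results("task-1")
    if wait_for_sends:
        # Let every scheduled send finish before shutting down.
        await asyncio.gather(*server.pending)
    # "Shutdown": anything still in flight is cancelled.
    for task in list(server.pending):
        task.cancel()
    await asyncio.sleep(0)  # give cancellations a chance to be processed
    return server.delivered


def demo():
    # First value: shutdown right after scheduling; second: shutdown after waiting.
    return asyncio.run(run(False)), asyncio.run(run(True))
```

When shutdown follows the call immediately, the result is never delivered. Waiting for the pending sends first lets it through.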
For old tasks the problem is the same: messages are queued in `self.results_to_send`, and `Node._try_shutdown` kicks in before they are sent.
This is a race condition: the shutdown occurs before the message to the requestor (e.g. RCT) is sent out from the message queue. Idea for a fix: change the graceful shutdown condition check (`Client#_try_shutdown`) to run periodically instead of being triggered when a task finishes, and retry sending messages from the queue before proceeding with shutdown.
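
One possible shape for the idea above, as a rough sketch only. `ShutdownGate`, `tick`, and `max_attempts` are hypothetical names, not existing project code. Dropping a message after a few failed attempts is an added assumption, so that a permanently unreachable requestor cannot block shutdown forever.

```python
from collections import deque


class ShutdownGate:
    """Hypothetical helper: decides when it is safe to shut down."""

    def __init__(self, send, max_attempts=3):
        self._send = send              # callable(msg) -> True if the send succeeded
        self._queue = deque()          # entries are (msg, failed_attempts)
        self._max_attempts = max_attempts
        self.shutdown_requested = False

    def enqueue(self, msg):
        self._queue.append((msg, 0))

    def request_shutdown(self):
        # Only records the intent; the actual decision is made in tick().
        self.shutdown_requested = True

    def tick(self):
        """Run periodically. Returns True once shutdown may proceed."""
        remaining = deque()
        while self._queue:
            msg, attempts = self._queue.popleft()
            if self._send(msg):
                continue  # delivered, nothing left to do for this message
            if attempts + 1 < self._max_attempts:
                remaining.append((msg, attempts + 1))  # retry on a later tick
            # else: give up on this message (assumption, see lead-in)
        self._queue = remaining
        return self.shutdown_requested and not self._queue
```

In a Twisted-based application, `tick` could be driven by something like `twisted.internet.task.LoopingCall`, with the real shutdown performed only once it returns `True`. The interval and the retry limit would need tuning against real network behaviour.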