golemfactory / clay

Golem is creating a global market for computing power.
https://golem.network
GNU General Public License v3.0

Provider does not report subtask results after graceful shutdown #4981

Open etam opened 4 years ago

etam commented 4 years ago

A race condition causes the shutdown to occur before a message to the requestor (e.g. RCT) is sent out from the message queue. Idea for a fix: make the graceful shutdown condition check (Client#_try_shutdown) run periodically instead of being triggered when a task finishes, and retry sending messages from the queue before proceeding with the shutdown.
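A rough sketch of that idea, not the actual Golem code: `GracefulShutdown`, `tick`, `flush`, `pending`, and the `max_attempts` bound are all hypothetical names and assumptions made for illustration. The check is meant to be driven by a periodic timer, and it only proceeds with the shutdown once the outgoing queue is empty or the retry budget is used up.

```python
from collections import deque
from typing import Callable, Deque


class GracefulShutdown:
    """Hypothetical helper: polled periodically instead of on task finish."""

    def __init__(self, send: Callable[[str], bool], max_attempts: int = 3) -> None:
        self.pending: Deque[str] = deque()  # messages not yet delivered
        self.tasks_running = 0
        self.shutdown_requested = False
        self.stopped = False
        self._send = send                   # returns True on successful delivery
        self._max_attempts = max_attempts   # assumption: bounded number of retries
        self._attempts = 0

    def flush(self) -> None:
        """Try each queued message once; requeue the ones that fail."""
        for _ in range(len(self.pending)):
            msg = self.pending.popleft()
            if not self._send(msg):
                self.pending.append(msg)

    def tick(self) -> bool:
        """Run one periodic check; return True once the shutdown proceeded."""
        if self.stopped:
            return True
        if not self.shutdown_requested or self.tasks_running > 0:
            return False
        self.flush()
        self._attempts += 1
        if self.pending and self._attempts < self._max_attempts:
            return False  # messages remain, retry on the next tick
        self.stopped = True
        return True
```

Giving up after `max_attempts` is a design choice of this sketch, so that an unreachable requestor cannot block the shutdown forever.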

etam commented 4 years ago

In the case of Task API tasks, TaskComputerAdapter._handle_computation_results calls send_results or send_task_failed on self._task_server. It then calls self._finished_cb(), which is Node._try_shutdown. The problem is that these TaskServer methods send messages (and, in the case of send_results, also resources) asynchronously, so the callback to Node._try_shutdown interrupts those asynchronous operations before they complete.
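To illustrate the ordering problem in isolation, here is a minimal sketch using asyncio from the standard library. It is an assumption-laden stand-in, not the real code path: `send_results` here is a dummy coroutine, the two `handle_results_*` functions are hypothetical, and the cancellation merely models a shutdown tearing down an in-flight send.

```python
import asyncio
from typing import List, Tuple

Log = List[Tuple[str, List[str]]]


async def send_results(sent: List[str]) -> None:
    """Dummy asynchronous send; the sleep stands in for network I/O."""
    await asyncio.sleep(0.01)
    sent.append("results")


async def handle_results_racy(sent: List[str], log: Log) -> None:
    # Start the send without waiting for it to complete.
    task = asyncio.ensure_future(send_results(sent))
    # The "finished" callback fires immediately and records what was sent so far.
    log.append(("shutdown", list(sent)))
    # Shutting down tears down the pending send before it delivers anything.
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)


async def handle_results_safe(sent: List[str], log: Log) -> None:
    # Wait for the send to complete before signalling that shutdown may proceed.
    await send_results(sent)
    log.append(("shutdown", list(sent)))
```

In the racy variant the shutdown step sees an empty `sent` list; in the safe variant it sees the delivered results, because the callback is sequenced after the send.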

etam commented 4 years ago

In the case of old tasks, the problem is the same: messages are queued in self.results_to_send, and Node._try_shutdown kicks in before they are sent.