bitwalker / swarm

Easy clustering, registration, and distribution of worker processes for Erlang/Elixir
MIT License

Handoff during node shutdown #67

Closed by FabienHenon 6 years ago

FabienHenon commented 6 years ago

I encountered an issue where a handoff started right as the destination node was killed.

Here is the gist of the logs from the alive node (there is nothing interesting in the killed node): https://gist.github.com/FabienHenon/84d985b370bc76826f46bbebf4ff7563

As you can see from the logs, I have 4 processes, one of which runs on the other node. I then killed the other node (this is visible in the logs), and immediately afterwards there is a [error] Scheduler BEGIN_HANDOFF %{} log entry, which means a handoff started on my scheduler process (#PID<0.792.0>). A few lines later there is another entry, [error] Scheduler DIE %{}, which means handle_info was called with :die on my scheduler process. Finally, the process that was located on the killed node was restarted on the alive node, but one process is now missing: the scheduler.
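
For context, a worker registered with Swarm typically handles the handoff messages roughly as sketched below. This is only a minimal sketch based on the message shapes described in the Swarm README. The module name SchedulerSketch, the Logger calls, and the choice to reply with {:resume, state} are assumptions for illustration, not the code from this report.

```elixir
defmodule SchedulerSketch do
  # Hypothetical worker; it only illustrates the Swarm handoff messages
  # and does not depend on Swarm itself.
  use GenServer
  require Logger

  def start_link(state \\ %{}), do: GenServer.start_link(__MODULE__, state)

  @impl true
  def init(state), do: {:ok, state}

  # Swarm asks the worker how it wants to be handed off to another node.
  # Replying {:resume, state} asks for the state to be passed to the new process
  # (assumed choice; :restart and :ignore are the other documented replies).
  @impl true
  def handle_call({:swarm, :begin_handoff}, _from, state) do
    Logger.error("Scheduler BEGIN_HANDOFF #{inspect(state)}")
    {:reply, {:resume, state}, state}
  end

  # The new process receives the state handed off by the old one.
  @impl true
  def handle_cast({:swarm, :end_handoff, handed_off}, _state) do
    {:noreply, handed_off}
  end

  # Sent when two copies of the process exist after a netsplit heals;
  # this sketch simply keeps the local state.
  def handle_cast({:swarm, :resolve_conflict, _other_state}, state) do
    {:noreply, state}
  end

  # Swarm tells the old process to shut down once it is no longer needed here.
  @impl true
  def handle_info({:swarm, :die}, state) do
    Logger.error("Scheduler DIE #{inspect(state)}")
    {:stop, :shutdown, state}
  end
end
```

In this sketch the {:swarm, :die} message stops the process for good, so nothing in the worker itself would bring it back if the node it was being handed off to had already gone away. Whether that is what happened here is not confirmed.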

bitwalker commented 6 years ago

Could you retest with master?

FabienHenon commented 6 years ago

Thank you, I'll retest with master. However, this issue is hard to reproduce. I'll reopen it if it happens again.