Open cburman01 opened 5 years ago
Hi @cburman01, did you find a solution for this in the meantime? I've just created a forum topic about this, because I thought it was obvious that the process would be restarted (https://elixirforum.com/t/genserver-isnt-restarted-on-graceful-shutdown/20523). However, it seems it's not only my problem.
Seems like it needs to be manually handled https://github.com/bitwalker/swarm/pull/83 😕
I just added a PR adding a `restart` parameter to `Swarm.register_name`, so the caller can decide what to do when the pid terminates gracefully (or not), here. The parameter name and its values were copied from `Supervisor` and inherit roughly the same concepts, with the difference that `:DOWN, :noconnection` always restarts the process (when the node goes down, just as before). The previous behaviour can be maintained if you set `:transient` (restart only if the node goes down abruptly).
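To make the intended semantics a bit more concrete, here is a minimal sketch of the restart decision as a pure function. It is not code from the PR: the module `RestartPolicySketch` and the function `restart?/2` are hypothetical names, and the handling of `:permanent`, `:transient` and `:temporary` for exits of the process itself assumes the usual `Supervisor` meaning, which the PR may not follow exactly. The only rule taken directly from the description above is that `:noconnection` always restarts.

```elixir
defmodule RestartPolicySketch do
  @moduledoc """
  Hypothetical helper, not part of Swarm. Decides whether a registered
  process should be restarted, given a Supervisor-style restart value
  and the reason carried by the :DOWN message.
  """

  @type policy :: :permanent | :transient | :temporary

  @spec restart?(policy(), term()) :: boolean()
  # From the PR description: a lost node connection always restarts,
  # whatever the policy is.
  def restart?(_policy, :noconnection), do: true

  # Assumption (Supervisor semantics): :permanent restarts on any exit,
  # including graceful ones.
  def restart?(:permanent, _reason), do: true

  # Assumption (Supervisor semantics): :temporary never restarts after
  # the process itself exits.
  def restart?(:temporary, _reason), do: false

  # Assumption (Supervisor semantics): :transient does not restart after
  # a graceful exit (:normal, :shutdown, {:shutdown, term})...
  def restart?(:transient, reason) when reason in [:normal, :shutdown], do: false
  def restart?(:transient, {:shutdown, _}), do: false
  # ...but does restart after any other exit reason.
  def restart?(:transient, _reason), do: true
end
```

Under these assumptions, a process registered as `:permanent` would come back after a `systemctl stop` on its node, while `:transient` would leave it down after a graceful exit.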
Hello, I have observed a peculiar behavior and was hoping to get some guidance.
I have noticed that processes are handed off between nodes differently when I gracefully terminate the application running as a systemd service (e.g. systemctl stop) versus when I just kill the BEAM pid (e.g. kill -9).
Here is the debug info when I just kill the pid: Nov 14 09:18:32 deverlapp02
Here is the debug info when I just stop the systemd service on one of the nodes:
I am just creating a supervisor that allows pids to be registered on only one of the nodes in the cluster. That works fine. The only problem I am having is that when systemd terminates the application gracefully, it's like it runs different handoff code.
Here are the workers that actually get started. They are generic, depending on DB data:
Here is the registry: