hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

Detecting PID 0 when dropping a node #859

Closed JelteF closed 2 years ago

JelteF commented 2 years ago

A user on Slack reported this behaviour:

 /usr/pgsql-13/bin/pg_autoctl drop node --pgdata /var/lib/pgsql/13/data --destroy
18:06:33 71415 INFO  Removing node with name "node_10" in formation "default" from the monitor
18:06:33 71415 INFO  An instance of pg_autoctl is running with PID 0, waiting for it to stop.
18:07:03 71415 INFO  Sending signal SIGQUIT to pg_autoctl process 0
Quit

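Sending this signal to PID 0 seems clearly incorrect: `kill(0, sig)` does not target a single process, it signals every process in the caller's process group, which would explain why the `pg_autoctl drop node` command itself exits with `Quit`. It caused the node to be stuck in the dropped state on the monitor.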
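
To illustrate the kind of guard that would avoid this, here is a minimal sketch in Python (not pg_autoctl's actual C code). The helper names `parse_pid` and `signal_single_process` are hypothetical, and the sketch assumes the PID is stored on the first line of the pidfile. The idea is simply to treat any PID that is not strictly positive as "no process running" and never pass it to `kill`.

```python
import os
import signal


def parse_pid(text):
    """Return a positive PID parsed from pidfile text, or None if unusable.

    Assumption: the PID is on the first line of the pidfile contents.
    """
    lines = text.splitlines()
    if not lines:
        return None
    try:
        pid = int(lines[0].strip())
    except ValueError:
        return None
    # kill(2) treats pid 0 as "every process in the caller's process group"
    # and negative values as process groups (or, for -1, every process the
    # caller may signal), so only pid > 0 names exactly one process.
    return pid if pid > 0 else None


def signal_single_process(pid, sig):
    """Send sig to exactly one process; return False if pid is invalid or gone."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The process exited between reading the pidfile and signalling it.
        return False
    return True
```

With a guard like this, an empty or zeroed pidfile yields `None` from `parse_pid`, and the caller can skip both the "waiting for it to stop" loop and the SIGQUIT instead of signalling its own process group.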

Link to slack message for reference: https://citus-public.slack.com/archives/C0XRHT1KJ/p1640164156184000