About Reset Node ID and pg_auto_failover to initial state.

hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

About Reset Node ID and pg_auto_failover to initial state. #969

Closed curtis18 closed 1 year ago

curtis18 commented 1 year ago

I would like to reset the node ID. I tried to clear everything on the monitor and on the Postgres node, but that was unsuccessful, and the service fails to run with the following error. How can I reset the node ID number, or return pg_auto_failover to its initial state? I deleted the databases on both the monitor and the Postgres nodes, but the node ID still shows 56. Thank you.

bagafoot commented 1 year ago

Connect to the monitor's Postgres database with psql; it has a "pgautofailover.node" table. You can delete the related node from that table, or run the function "pgautofailover.remove_node(host, port)".

DimCitus commented 1 year ago

Please provide a clear set of steps that allow reproducing your problem, or at the very least provide the command line(s) you tried and their result. Also never share logs as images, as it makes them hard to read and impossible to process. Closing for lack of actionable information. If you need help, consider opening this issue again with some level of information.

curtis18 commented 1 year ago

> Connect to the monitor's Postgres database with psql; it has a "pgautofailover.node" table. You can delete the related node from that table, or run the function "pgautofailover.remove_node(host, port)".

Thanks for your help @bagafoot. I tried to locate the "pgautofailover.node" table, but I cannot find it.

pg_auto_failover=# \dt
Did not find any relations.

In the end I purged, restarted, and reinstalled pg_auto_failover to resolve the issue, and I cannot reproduce it since then, @DimCitus. I believe it may occur when the pg_auto_failover service is still running but the pg_data folder has been removed, so the non-working service stays alive and cannot be stopped.

bagafoot commented 1 year ago

In "pgautofailover.node", pgautofailover is the name of the schema that contains the node table. Plain \dt only lists tables on the search path, so you should check with \dt pgautofailover.*
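For readers who want to script the steps described above, here is a minimal sketch in Python that only builds the psql command lines: list the tables in the pgautofailover schema, look at the node table, then call pgautofailover.remove_node(host, port). The monitor URI and the host names are hypothetical placeholders, and the helper names (psql_argv, monitor_cleanup_commands) are made up for illustration. It prints the commands rather than running them.

```python
import shlex

# Hypothetical monitor connection string; replace with your own.
MONITOR_URI = "postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover"


def psql_argv(uri, command):
    """Build the argv for one psql command; -X skips any local psqlrc."""
    return ["psql", uri, "-X", "-c", command]


def monitor_cleanup_commands(uri, host, port):
    """Return psql invocations to inspect the monitor, then remove one node."""
    # Quote the host as a SQL string literal by doubling single quotes.
    host_literal = "'" + host.replace("'", "''") + "'"
    return [
        # The node table lives in the pgautofailover schema, not in public.
        psql_argv(uri, r"\dt pgautofailover.*"),
        psql_argv(uri, "SELECT * FROM pgautofailover.node;"),
        # Function named in this thread: remove_node(host, port).
        psql_argv(
            uri,
            f"SELECT pgautofailover.remove_node({host_literal}, {int(port)});",
        ),
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in monitor_cleanup_commands(MONITOR_URI, "node1.example.com", 5432):
        print(shlex.join(argv))
```

Printing first makes it easy to review what would be sent to the monitor; each list can then be passed to subprocess.run(argv, check=True) once it looks right.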

curtis18 commented 1 year ago

Thank you @bagafoot. Is there any pg_autoctl command that can pause streaming replication on all secondary nodes? I know pg_wal_replay_pause() can do it, but it has to be run from psql, one node at a time.
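As a rough illustration of the one-by-one approach mentioned above, the Python sketch below builds one psql invocation per secondary that calls pg_wal_replay_pause(). The host list, the "postgres" database name, and the helper names are assumptions for the example and do not come from pg_auto_failover. Note that pg_wal_replay_pause() pauses WAL replay on a standby; the standby still receives WAL, and pg_wal_replay_resume() undoes the pause.

```python
import shlex

# Hypothetical secondary nodes as (host, port) pairs; replace with your own.
SECONDARIES = [("node2.example.com", 5432), ("node3.example.com", 5432)]

PAUSE_SQL = "SELECT pg_wal_replay_pause();"


def pause_replay_argv(host, port, dbname="postgres"):
    """Build the psql argv that pauses WAL replay on one standby."""
    return [
        "psql", "-X",
        "-h", host,
        "-p", str(port),
        "-d", dbname,  # assumed database name
        "-c", PAUSE_SQL,
    ]


def pause_all(secondaries):
    """Return one psql invocation per secondary, in the given order."""
    return [pause_replay_argv(host, port) for host, port in secondaries]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in pause_all(SECONDARIES):
        print(shlex.join(argv))
```

The connecting role needs permission to execute pg_wal_replay_pause(), which by default means a superuser. To run the commands rather than print them, each list can be passed to subprocess.run(argv, check=True).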