For interactive testing and QA, use the following setup:
$ make NODES=3 NODES_PRIOS=50,50,0 cluster
Then you can play around with putting node2 and node3 into maintenance, and then both of them together.
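For instance, a full round-trip through maintenance could look like this (just a sketch; the --pgdata paths assume the node1..node3 directories that the Makefile target creates):

# put one secondary into maintenance, then the other
$ pg_autoctl enable maintenance --pgdata node2
$ pg_autoctl enable maintenance --pgdata node3

# check how the monitor sees the formation at this point
$ pg_autoctl show state --pgdata node1

# bring both nodes back
$ pg_autoctl disable maintenance --pgdata node3
$ pg_autoctl disable maintenance --pgdata node2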
I got into an unrecoverable state with these steps:
make cluster -j20 TMUX_LAYOUT=tiled NODES=3
pg_autoctl enable maintenance --pgdata node2
pg_autoctl enable maintenance --pgdata node3
pg_autoctl enable maintenance --pgdata node1 --allow-failover
The cluster will then be stuck in this state:
Disabling maintenance on node1 does not work:
Disabling maintenance on node2 (or node3) does not work either, because it gets stuck in this loop when you try:
The state then stays like this:
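The stuck state can be inspected from any node's data directory with the standard pg_autoctl commands, for example:

$ pg_autoctl show state --pgdata node1
# number-sync-standbys is relevant to what the monitor allows here
$ pg_autoctl show settings --pgdata node1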
> I got into an unrecoverable state with these steps:
This is now fixed: the command fails with the following error instead of leaving the cluster stuck. We might want to avoid the first WARNING though, what do you think?
$ pg_autoctl enable maintenance --pgdata node1 --allow-failover
12:27:29 65490 WARN WARNING: Starting maintenance on node 1 "node1" (localhost:5501) will block writes on the primary node 1 "node1" (localhost:5501)
12:27:29 65490 WARN DETAIL: we now have 0 healthy node(s) left in the "secondary" state and formation "default" number-sync-standbys requires 1 sync standbys
12:27:29 65490 ERROR Monitor ERROR: Starting maintenance on node 1 "node1" (localhost:5501) in state "primary" is not currently possible
12:27:29 65490 ERROR Monitor DETAIL: there is currently 0 candidate nodes available
12:27:29 65490 ERROR Failed to start_maintenance of node 1 from the monitor
12:27:29 65490 FATAL Failed to enable maintenance of node 1 on the monitor, see above for details
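Given that error, one way forward might be to first bring one secondary back from maintenance, so that the monitor has a failover candidate again; a sketch of the idea, not an official procedure:

# bring one secondary back so the monitor has a failover candidate
$ pg_autoctl disable maintenance --pgdata node2

# maintenance on the primary can then proceed via a failover
$ pg_autoctl enable maintenance --pgdata node1 --allow-failover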
Again I found a way to reach an unrecoverable state:
make cluster -j20 TMUX_LAYOUT=tiled NODES=3
pg_autoctl enable maintenance --pgdata node3
pg_autoctl enable maintenance --pgdata node1 --allow-failover
The cluster will then be stuck in this state:
Disabling maintenance on node1 does not work:
Disabling maintenance on node3 does not work either, because it gets stuck in this loop when you try:
The state then stays like this:
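The retry loop also shows up in the monitor's events; something like the following should display the repeated transition attempts (sketch):

$ pg_autoctl show events --pgdata node3 --count 25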
> Again I found a way to reach an unrecoverable state:
>
> make cluster -j20 TMUX_LAYOUT=tiled NODES=3
> pg_autoctl enable maintenance --pgdata node3
> pg_autoctl enable maintenance --pgdata node1 --allow-failover
This can now be unblocked by running pg_autoctl disable maintenance --pgdata node3; but it still fails when trying to disable maintenance on node1. I am looking at adding a transition from prepare_maintenance back to primary, if that makes sense.
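Concretely, the partial recovery sequence looks like this (node1 stays blocked until such a transition exists):

$ pg_autoctl disable maintenance --pgdata node3
$ pg_autoctl show state --pgdata node3

# still fails at this point, pending a prepare_maintenance -> primary transition
$ pg_autoctl disable maintenance --pgdata node1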
I think this is good to merge. I ran into some more issues with wait_maintenance and opened a PR to fix some of those: #794. I don't want to block this PR on that, though.
We used to disallow starting maintenance on a node in some cases, but it seems the user should be able to decide for themselves when they need to run maintenance on their own nodes. After all, we don't stop Postgres when going into maintenance, so users may change their mind without impacting their service. A WARNING message is now displayed in some cases that were previously prevented outright.
Also, the transition from WAIT_MAINTENANCE to MAINTENANCE had been failing since we improved the Group State Machine for the primary node, which would go from JOIN_PRIMARY to PRIMARY without waiting for the other nodes to reach their assigned state of WAIT_MAINTENANCE.
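As an illustration of the intended behaviour, a test helper could poll the monitor until every node's reported state matches its assigned state, along these lines (a sketch; the awk column handling assumes the tabular pg_autoctl show state output with two header lines and the reported/assigned states in the last two columns, which may vary across versions):

# wait until all reported states match the assigned states
until pg_autoctl show state --pgdata node1 \
    | awk -F'|' 'NR > 2 {
          gsub(/ /, "", $(NF-1)); gsub(/ /, "", $NF)
          if ($(NF-1) != $NF) mismatch = 1
      }
      END { exit mismatch }'
do
    sleep 1
done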