sorintlab / stolon

PostgreSQL cloud native High Availability and more.
https://talk.stolon.io
Apache License 2.0

Concerns about stability of automatic restart of nodes feature #707

Open maksm90 opened 5 years ago

maksm90 commented 5 years ago

Submission type

Environment

Stolon version

0.14.0

Bug description

Some PostgreSQL parameters (max_connections, max_locks_per_transaction, etc.) require special care regarding the order in which nodes are restarted. This issue is described in the PostgreSQL documentation. The current implementation of the auto restart feature (#568) doesn't take this issue into account and can lead to problems.

There is another issue with the auto restart feature. A PostgreSQL instance performs a so-called shutdown checkpoint during a normal stop, flushing all dirty buffers to disk. For instances that accumulate massive changes in the buffer pool (e.g. with spread checkpoints), restarting all nodes simultaneously can cause long downtime. Therefore it makes sense to issue an explicit CHECKPOINT before restarting nodes, as sketched below.
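For illustration only, here is a minimal sketch in Go of what such a pre-restart step could look like. This is not stolon code; the hook name checkpointBeforeRestart and the connection string are just assumptions.

// Minimal sketch, not stolon code: issue an explicit CHECKPOINT right before
// stopping postgres so that the shutdown checkpoint has almost nothing left
// to flush and the restart stays short.
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // PostgreSQL driver
)

// checkpointBeforeRestart is a hypothetical pre-restart hook.
func checkpointBeforeRestart(connString string) error {
	db, err := sql.Open("postgres", connString)
	if err != nil {
		return err
	}
	defer db.Close()

	// Bound the time we are willing to wait for the checkpoint.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	// CHECKPOINT flushes all dirty buffers now instead of at shutdown time.
	_, err = db.ExecContext(ctx, "CHECKPOINT")
	return err
}

func main() {
	// Example connection string; adjust to the local instance.
	if err := checkpointBeforeRestart("host=localhost port=5432 dbname=postgres sslmode=disable"); err != nil {
		log.Fatalf("pre-restart checkpoint failed: %v", err)
	}
}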

Steps to reproduce the problem

After decreasing max_connections from 100 to 50, the standby node cannot restart and fails with the following error:

LOG:  entering standby mode
FATAL:  hot standby is not possible because max_connections = 50 is a lower setting than on the master server (its value was 100)
LOG:  startup process (PID 21841) exited with exit code 1
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

Enhancement Description

Since automatic node restart in the cluster has subtle issues, I propose delegating it to the user by exposing a command to explicitly restart nodes, stolonctl restartkeeper <keeper_name>, as implemented in Patroni. When all the necessary infrastructure for auto restart is ready, the automaticPgRestart option can be brought back.

Any thoughts?

sgotti commented 5 years ago

The current implementation of the auto restart feature (#568) doesn't take this issue into account and can lead to problems.

But when the master restarts, the standby should then start correctly, or am I missing something?

Since automatic node restart in the cluster has subtle issues, I propose delegating it to the user by exposing a command to explicitly restart nodes, stolonctl restartkeeper <keeper_name>.

This is another feature and not blocking, since users can already restart postgres without a stolon command by restarting the keeper (or see the comment where I propose adding a signal handler to the keeper to restart just postgres: https://github.com/sorintlab/stolon/issues/255#issuecomment-290813423).

There are some similar issues (like #88), but that one was closed. I'm personally against adding such a command, for many reasons:

maksm90 commented 5 years ago

But when the master restarts, the standby should then start correctly, or am I missing something?

Your statement is correct when the master node restarts before the standby after increasing the problematic parameter. But in the reverse case (when the standby restarts before the master after decreasing the parameter), the standby node will keep seeing the old value of the parameter from the master and will never be able to restart normally. The changes to that parameter would have to be reverted manually.

There are some similar issues (like #88), but that one was closed. Stolon is a distributed system with a declarative configuration, and the components don't expose any direct API (like an HTTP API). So for the command to work, it, like any stolonctl command, can only talk to the store, and you have to find a way to signal the keeper to restart.

OK, I understand your idea. Then I assume the sentinel would be responsible for orchestrating the application of pgParameters changes in the right order within the automatic restart feature.
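To make the ordering rule concrete, something like this (just a sketch with made-up names, not sentinel code, based on the documented invariant that the standby's value must be equal to or greater than the primary's):

// Sketch only: decide which side must restart first when a hot-standby
// parameter changes. Increases must reach the standbys first, decreases
// must reach the primary first; other parameters can restart in any order.
package main

import "fmt"

// Parameters for which the "standby >= primary" invariant applies.
var hotStandbyParams = map[string]bool{
	"max_connections":           true,
	"max_worker_processes":      true,
	"max_prepared_transactions": true,
	"max_locks_per_transaction": true,
}

type restartOrder int

const (
	anyOrder      restartOrder = iota
	standbysFirst              // value increased
	primaryFirst               // value decreased
)

func orderFor(param string, oldVal, newVal int) restartOrder {
	if !hotStandbyParams[param] || oldVal == newVal {
		return anyOrder
	}
	if newVal > oldVal {
		return standbysFirst
	}
	return primaryFirst
}

func main() {
	fmt.Println(orderFor("max_connections", 100, 50) == primaryFirst)   // true: decrease, master first
	fmt.Println(orderFor("max_connections", 100, 200) == standbysFirst) // true: increase, standbys first
}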

maksm90 commented 5 years ago

Users can already do this.

IMO killing the postmaster manually (or automatically from an external script) looks like duct tape. Users should have a more convenient way to restart the PostgreSQL nodes (not the keepers).

Stolon is a distributed system with a declarative configuration, and the components don't expose any direct API (like an HTTP API). So for the command to work, it, like any stolonctl command, can only talk to the store, and you have to find a way to signal the keeper to restart.

I would propose using a counter that specifies the number of requested restarts, something like PgRestartCounter, inside the Spec and State of the keepers. If the sentinel or a user wants to restart a specific node, it simply increments this counter in the Spec. The keeper updates the same-named counter in its State after a successful restart. I think this logic fits the overall declarative approach to configuration.
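A rough sketch of the idea (made-up type and field names, not the actual stolon Spec/State layout):

// Sketch of the restart-counter idea; not actual stolon code.
package main

import "fmt"

type KeeperSpec struct {
	// Incremented by the sentinel (or by stolonctl on behalf of the user)
	// every time a restart of this keeper's postgres instance is requested.
	PgRestartCounter uint64
}

type KeeperState struct {
	// Set by the keeper to the Spec value once the restart has succeeded.
	PgRestartCounter uint64
}

// requestRestart is what the sentinel/stolonctl side would do: bump the
// desired counter in the declarative spec stored in the store.
func requestRestart(spec *KeeperSpec) {
	spec.PgRestartCounter++
}

// reconcile is what the keeper would do on every check loop: restart
// postgres only if the desired counter is ahead of the reported one.
func reconcile(spec KeeperSpec, state *KeeperState, restartPostgres func() error) error {
	if state.PgRestartCounter >= spec.PgRestartCounter {
		return nil // nothing to do, already converged
	}
	if err := restartPostgres(); err != nil {
		return err // leave State untouched; retry on the next loop
	}
	state.PgRestartCounter = spec.PgRestartCounter
	return nil
}

func main() {
	spec := KeeperSpec{}
	state := KeeperState{}
	requestRestart(&spec)
	_ = reconcile(spec, &state, func() error {
		fmt.Println("restarting postgres...")
		return nil
	})
	fmt.Println("converged:", state.PgRestartCounter == spec.PgRestartCounter)
}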

aswinkarthik commented 4 years ago

But in reverse case (when standby restarts earlier then master after decreasing parameter) standby node will see the old value of parameter on master forever and cannot restart normally. It will be required to revert changes of that parameter manually.

Regarding your statement: the standby node does not "see" the old value of the parameter on the master and get stuck there. The standby node's configuration comes from its own postgresql.conf.

From PostgreSQL docs, the key point is

If these parameters are not set high enough then the standby will refuse to start

The standby may fail to restart at first, but the keeper will keep restarting it. However, this will happen at most once: the moment the master has restarted with the updated configuration, the standby restart will work.

maksm90 commented 4 years ago

Not exactly. The problem here is that, at startup, the standby relies on the config data from the control file to determine whether the restart-critical parameters satisfy the invariant from the docs:

For these parameters, the value on the standby must be equal to or greater than the value on the primary

This is described in the PostgreSQL sources.

When the critical parameter is decreased on the master side first, the new value is replicated to the standby in a PARAMETER_CHANGE WAL record and then applied to the control file structure on the standby side. A subsequent standby restart uses that updated value from the control file to check the invariant, and the procedure eventually succeeds.

But if the standby applies the decreased value before it has been replicated from the master and written to the control file, then it cannot restart at all (that invariant violation is fatal to the startup process). And if you don't revert to the old value, the only workaround is to manually fix the value in the control file, which looks like a big hack.

maksm90 commented 4 years ago

IMO the most reasonable way to handle this issue is to check the invariant for the updated restart-critical parameters on the keeper side (comparing them with the corresponding values kept in the local control file) before applying them and restarting the standby.
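As an illustration of what that check could look like (a sketch only, not stolon code): it shells out to pg_controldata, whose output labels these parameters as "max_connections setting", "max_worker_processes setting", "max_prepared_xacts setting" and "max_locks_per_xact setting". The paths in main are just examples.

// Sketch of the proposed keeper-side check: before applying a new value of a
// restart-critical parameter on a standby, compare it with the value recorded
// in the local control file and postpone the restart if it would violate the
// "standby >= primary" invariant.
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"path/filepath"
	"strconv"
	"strings"
)

// Map the GUC names used in pgParameters to the labels pg_controldata prints.
var controlFileLabel = map[string]string{
	"max_connections":           "max_connections setting",
	"max_worker_processes":      "max_worker_processes setting",
	"max_prepared_transactions": "max_prepared_xacts setting",
	"max_locks_per_transaction": "max_locks_per_xact setting",
}

// controlFileValues runs pg_controldata against the data directory and
// extracts the recorded values of the restart-critical parameters.
func controlFileValues(binPath, dataDir string) (map[string]int, error) {
	out, err := exec.Command(filepath.Join(binPath, "pg_controldata"), dataDir).Output()
	if err != nil {
		return nil, err
	}
	values := map[string]int{}
	scanner := bufio.NewScanner(strings.NewReader(string(out)))
	for scanner.Scan() {
		line := scanner.Text()
		for guc, label := range controlFileLabel {
			if strings.HasPrefix(line, label+":") {
				v, err := strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, label+":")))
				if err == nil {
					values[guc] = v
				}
			}
		}
	}
	return values, scanner.Err()
}

// checkStandbyInvariant returns an error if any new setting is lower than the
// value the standby has recorded from the primary, since restarting with such
// a value would make the startup process fail.
func checkStandbyInvariant(newParams map[string]int, binPath, dataDir string) error {
	recorded, err := controlFileValues(binPath, dataDir)
	if err != nil {
		return err
	}
	for guc, newVal := range newParams {
		if rec, ok := recorded[guc]; ok && newVal < rec {
			return fmt.Errorf("%s=%d is lower than the primary's value %d recorded in the control file; postponing restart", guc, newVal, rec)
		}
	}
	return nil
}

func main() {
	err := checkStandbyInvariant(map[string]int{"max_connections": 50},
		"/usr/lib/postgresql/12/bin", "/var/lib/postgresql/data")
	fmt.Println(err)
}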

jinguiy commented 3 years ago

I hit this issue in my environment. After I changed max_connections to a smaller value, the master was restarted and max_connections was updated, but the standby failed to start. I can't find any way to work around it except changing max_connections back to a larger value. Is there any way to work around this and change max_connections to a smaller value?

maksm90 commented 3 years ago

Hi @jinguiy! Unfortunately, there is no solution here other than reverting the max_connections setting, turning off automaticPgRestart, and restarting the nodes manually in the valid order. But the PostgreSQL community, for its part, is now trying to mitigate this behavior.

jinguiy commented 3 years ago

@maksm90 thanks. I tried deleting pods to restart them; this way it takes time to restart the master, so the standby got promoted to master and the stolon cluster failed to start. What worked was restarting postgres inside the pods in the right order.