Follow primary command not fixing postgresql.auto.conf

pgpool / pgpool2

This is the official mirror of git://git.postgresql.org/git/pgpool2.git. Note that this is just a *mirror* - we don't work with pull requests on github. Please subscribe to pgpool-hackers mailing list from our website and submit your patch to this mailing list.

https://www.pgpool.net

Other

346 stars 91 forks source link

Follow primary command not fixing postgresql.auto.conf #67

Open mirecekparecek opened 3 months ago

mirecekparecek commented 3 months ago

Hello,

I use postgresql 15.6 and pgpool 4.5.2 after changin the home directory and setting up passwordless ssh with failover and followprimary it works well except for the postgresql.auto.conf it uses the wrong ip of the two node cluster when I use pcp_promote_node --switchover

follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'

After successfully running the node is in waiting state unless I do show pool_nodes and after that it is in pgpool but the replication doesn't work because of the postgresql.auto.conf

Also I try to use pcp.conf but it works only for one user when i try to use pcp_* (any pcp command like pcp_node_info and so on) from different users it returns that the authentication failed even though there are multiple defined in the file pcp.conf and I defined all using this one-liner: echo 'user:'pg_md5 password >> /etc/pgpool-II/pcp.conf. How can it work for one user and not for others ?

Thank you very much for response

pengbo0328 commented 3 months ago

The sample scripts included in the Pgpool-II package don't use postgresql.auto.conf. Below is an example: https://www.pgpool.net/docs/latest/en/html/example-cluster.html

If you are using a custom script, you will need to modify it to configure the streaming replication settings.

mirecekparecek commented 3 months ago

I use the sample script only changing the home directory and user name and replication on the postgres level - streaming replication and wanted to do switchover with pcp_promote_node and it works but the cluster misses the wal sender and receiver which is the only thing I have to change manually and restart the standby. I have to change the postgresql.auto.conf on the new standby otherwise it has wrong ip adress in there and the replication doesn't work. But, the variables in the script get set the primary ip address of the new primary correctly. Which is why I don't understand why it ends up like that even more when it doesn't use the autoconf.

Thank you

pengbo0328 commented 3 months ago

@mirecekparecek

The sample script saves the settings for streaming replication in ${NODE_PGDATA}/myrecovery.conf. It doesn't writes the settings for streaming replication to postgresql.auto.conf.
I think you used pcp_promote_node --switchover -n <node_id> to perform switchover. To use this command, you need to set follow_primary_command.

https://www.pgpool.net/docs/latest/en/html/pcp-promote-node.html

mirecekparecek commented 3 months ago

Yes the follow_primary_command is set with the sample, the myrecovery.conf has set up primary_conninfo correctly but that's why I don't know why it puts wrong ip address to the postgresql.auto.conf and destroys the replication.

pengbo0328 commented 3 months ago

Yes the follow_primary_command is set with the sample, the myrecovery.conf has set up primary_conninfo correctly but that's why I don't know why it puts wrong ip address to the postgresql.auto.conf and destroys the replication.

Because the sample script doesn't use pg_basebackup -R option and doesn't use ALTER SYSTEM command, I think postgresql.auto.conf is modified by other software.

mirecekparecek commented 3 months ago

It's on test server where there is only a postgresql and pgpool for testing pgpool, nothing else.

pengbo0328 commented 3 months ago

Pgpool-II doesn't use postgresql.auto.conf for streaming replication. Did you configure streaming replication of PostgreSQL using postgresql.auto.conf by yourself during the initial setup?

mirecekparecek commented 3 months ago

I just changed it when I created the replication by pgbasebackup. I first set up streaming replication in postgres then put pgpool above it. Couldn't the issue lay in pg_rewind ?

pengbo0328 commented 3 months ago

I just changed it when I created the replication by pgbasebackup.

I think you mean that you created standby using pg_basebackup with -R option. -R option will automatically create primary_conninfo in postgresql.auto.conf.

Couldn't the issue lay in pg_rewind?

The follow_primary sample script uses pg_rewind. See more detail: https://git.postgresql.org/gitweb/?p=pgpool2.git;a=blob_plain;f=src/sample/scripts/follow_primary.sh.sample;hb=refs/heads/V4_5_STABLE

The problem is that the sample script doesn't use ALTER SYSTEM to change primary_conninfo in postgresql.auto.conf. If you want to use postgresql.auto.conf, you can customize the sample script to use ALTER SYSTEM to change primary_conninfo.

I first set up streaming replication in postgres then put pgpool above it.

Alternatively, you need to configure primary_conninfo in the same way as shown in the sample script when you first set up the standby server.

cat > myrecovery.conf << EOT
primary_conninfo = 'host=${NEW_PRIMARY_NODE_HOST} port=${NEW_PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${NODE_HOST} passfile=''/var/lib/pgsql/.pgpass'''
recovery_target_timeline = 'latest'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT

vi postgresql.conf
(add the line below)
include_if_exists = 'myrecovery.conf'

mirecekparecek commented 3 months ago

Yes. you're right with everything, the thing is, both postgresql.auto.conf are right before i do switchover, but then when I do switchover the old primary has wrong ip in there, even though the variables that you can see in log for example %d, %D are set up right and I don't understand why after the switchover it ends up with wrong one because that makes it so that the new standby(old primary) can't connect, unless i manually change it.

pengbo0328 commented 3 months ago

but then when I do switchover the old primary has wrong ip in there, even though the variables that you can see in log for example %d, %D are set up right and I don't understand why after the switchover it ends up with wrong one because that makes it so that the new standby(old primary) can't connect, unless i manually change it.

Because pgpool doesn't change postgresql.auto.conf. The old settings in postgresql.auto.conf will override the settings used by pgpool (e.g. %d, %D).

I suggest configuring primary_conninfo in the same way as shown in the sample script when you first set up the standby server.

cat > myrecovery.conf << EOT
primary_conninfo = 'host=${NEW_PRIMARY_NODE_HOST} port=${NEW_PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${NODE_HOST} passfile=''/var/lib/pgsql/.pgpass'''
recovery_target_timeline = 'latest'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT

vi postgresql.conf
(add the line below)
include_if_exists = 'myrecovery.conf'