pgpool / pgpool2

This is the official mirror of git://git.postgresql.org/git/pgpool2.git. Note that this is just a *mirror* - we don't work with pull requests on github. Please subscribe to pgpool-hackers mailing list from our website and submit your patch to this mailing list.
https://www.pgpool.net
Other
346 stars 91 forks source link

When using snapshot isolation doing pcp_recovery_node will execute recovery_2nd_stage and there is a promote instruction there #53

Open possebon opened 6 months ago

possebon commented 6 months ago

Issues with recovery_2nd_stage Execution in PostgreSQL 14 and pgpool 4.5.2

Configuration:

Objective: Use load balancing for all SELECT queries

Documentation Reference: pgpool Documentation

Issue Description: I am configuring PostgreSQL 14 with pgpool 4.5.2 to load balance all SELECT queries. Following the documentation, I encountered an issue where the recovery_2nd_stage fails to execute correctly.

Observed Behavior: While debugging the configuration and logs, I found the following error:

+ /usr/pgsql-14/bin/psql -h pgcluster01 -p 5433 -t -c 'SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn' template1
+ read i
+ '[' postgres '!=' '' ']'
+ psql -h pgcluster01 -p 5433 -c 'SELECT setval(oid, nextval(oid)) FROM pg_class WHERE relkind = '\''S'\''' postgres
 setval
--------
(0 rows)

+ read i
+ '[' pgtestdb '!=' '' ']'
+ psql -h pgcluster01 -p 5433 -c 'SELECT setval(oid, nextval(oid)) FROM pg_class WHERE relkind = '\''S'\''' pgtestdb
Password for user postgres:
psql: error: connection to server at "pgcluster01" (10.28.141.21), port 5433 failed: fe_sendauth: no password supplied
+ read i
+ '[' 2 -ne 0 ']'
+ echo ERROR: recovery_2nd_stage: failed.
ERROR: recovery_2nd_stage: failed.
+ exit 1
2024-05-29 10:41:53.305 -03 [82771] ERROR:  pgpool_recovery failed
2024-05-29 10:41:53.305 -03 [82771] STATEMENT:  SELECT pgpool_recovery('recovery_2nd_stage', 'pgcluster02', '/var/lib/pgsql/14/data', '5433', 1, '5433', 'pgcluster01')
[root@pgcluster01 data]#

Diagnosis: I created the pgtestdb to test using pgbench, but this database was not listed in .pgpass. The documentation states that the postgres user only needs to connect to the postgres database (I also added template1 to resolve the original error).

When recovery_2nd_stage fails, it does not execute the following promotion command:

# Promote the target server to a primary server
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@$DEST_NODE_HOST "
    $PGHOME/bin/pg_ctl -l /dev/null -w -D $DEST_NODE_PGDATA promote"

Workaround: To avoid further errors when user databases are present, I modified the .pgpass file to use * instead of specifying databases manually. This allowed pcp_recovery_node to run smoothly (as recovery_2nd_stage completes successfully), but no replication slot appears when checking with SELECT * FROM pg_stat_replication.

Logs:

The PostgreSQL logs for the pcp_recovery_node process show no errors, indicating that everything ran successfully.

Concerns:

Request for Clarification:

pengbo0328 commented 5 months ago

@possebon Sorry for the late response.

Diagnosis: I created the pgtestdb to test using pgbench, but this database was not listed in .pgpass. The documentation states that the postgres user only needs to connect to the postgres database (I also added template1 to resolve the original error).

If you use other users in your scripts, you need to add those users in .pgpass.

Request for Clarification:

Is there a configuration detail I might be missing?
How should data replication be managed in this scenario to maintain consistency across nodes?

In snapshot isolation mode, streaming replication is not used. Pgpool-II will replicate data to each node. (Pgpool-II executes the same queries on each PostgreSQL node.)

https://www.pgpool.net/docs/latest/en/html/runtime-config-running-mode.html

If you want to use streaming replication for data consistency, you can use Pgpool-II's streaming replication mode by setting backend_clustering_mode = 'streaming_replication'

Below is the configuration example: https://www.pgpool.net/docs/latest/en/html/example-cluster.html