Open IIvyPy opened 1 year ago
I found the cause: the password of the standby user on my cluster differs from the password of the standby user on the source cluster. I would like to know how to set the same password for all users.
You need to define a Secret to set the password:
apiVersion: v1
kind: Secret
metadata:
  name: standby.${db_cluster}.credentials.postgresql.acid.zalan.do
  namespace: postgres
  labels:
    application: spilo
    cluster-name: ${db_cluster}
    team: audienti
stringData:
  username: standby
  password: ${source_password}
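To fill in ${source_password}, one approach is to read the standby user's password out of the source cluster's secret and create the matching secret for the standby cluster before deploying it. This is only a sketch: the secret, cluster, and namespace names below follow the operator's `<user>.<cluster>.credentials...` naming pattern and the examples in this thread, so adjust them for your setup.

```shell
# Hypothetical names following the operator's naming pattern -- adjust as needed.
# Read the standby password from the source cluster's secret:
source_password=$(kubectl get secret \
  standby.acid-minimal-cluster.credentials.postgresql.acid.zalan.do \
  -n default -o 'jsonpath={.data.password}' | base64 -d)

# Create the matching secret for the standby cluster before deploying it:
kubectl create secret generic \
  standby.acid-standby-cluster.credentials.postgresql.acid.zalan.do \
  -n postgres \
  --from-literal=username=standby \
  --from-literal=password="${source_password}"
```

Note that `.data.password` is base64-encoded in the stored Secret, hence the `base64 -d`; the `stringData` field in the manifest above exists precisely so you can supply the plain value instead.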
This is still happening to me on v1.12.2. Since it is a FATAL error, it consistently takes down my cluster every time I deploy (bare metal). Any tips?
$ kubectl exec -it -n postgres-operator postgres-cluster-0 -- patronictl show-config
failsafe_mode: false
loop_wait: 10
maximum_lag_on_failover: 33554432
postgresql:
  parameters:
    archive_mode: 'on'
    archive_timeout: 1800s
    autovacuum_analyze_scale_factor: 0.02
    autovacuum_max_workers: 5
    autovacuum_vacuum_scale_factor: 0.05
    checkpoint_completion_target: 0.9
    hot_standby: 'on'
    log_autovacuum_min_duration: 0
    log_checkpoints: 'on'
    log_connections: 'on'
    log_disconnections: 'on'
    log_line_prefix: '%t [%p]: [%l-1] %c %x %d %u %a %h '
    log_lock_waits: 'on'
    log_min_duration_statement: 500
    log_statement: ddl
    log_temp_files: 0
    max_connections: '640'
    max_replication_slots: 10
    max_wal_senders: 10
    tcp_keepalives_idle: 900
    tcp_keepalives_interval: 100
    track_functions: all
    wal_compression: 'on'
    wal_level: hot_standby
    wal_log_hints: 'on'
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30
And logs:
# kubectl logs -n postgres-operator postgres-cluster-2 | grep FATAL
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
...
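The FATAL lines say the "standby" role does not exist on the server being contacted. A quick way to confirm whether the role is present on the source cluster is a catalog query from inside the primary pod; the pod and namespace names here are the ones from this thread and may differ in your deployment.

```shell
# Illustrative pod/namespace names; prints 1 if the role exists, nothing otherwise
kubectl exec -n postgres-operator postgres-cluster-0 -- \
  psql -U postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname = 'standby'"
```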
2022-11-10 13:37:57,484 INFO: Selected new K8s API server endpoint https://172.16.3.44:6443
2022-11-10 13:37:57,544 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-11-10 13:37:57,551 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:37:57,680 INFO: trying to bootstrap a new standby leader
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:08,070 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:08,070 INFO: not healthy enough for leader race
2022-11-10 13:38:08,108 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:18,063 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:18,063 INFO: not healthy enough for leader race
2022-11-10 13:38:18,064 INFO: bootstrap_standby_leader in progress
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:28,052 ERROR: Error creating replica using method basebackup_fast_xlog: /scripts/basebackup.sh exited with code=1
2022-11-10 13:38:28,052 ERROR: failed to bootstrap clone from remote master postgresql://acid-minimal-cluster.default:5432
2022-11-10 13:38:28,053 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
2022-11-10 13:38:28,065 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:28,065 INFO: not healthy enough for leader race
2022-11-10 13:38:28,143 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:38,073 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 143, in main
    return patroni_main()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 135, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 100, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 105, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 59, in run
    self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 108, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1514, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1388, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1280, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1273, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
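Two distinct failures show up in this log: the password mismatch already discussed, and "no pg_hba.conf entry for replication connection ... no encryption", which means the source cluster also has no pg_hba rule matching an unencrypted replication connection from the standby pod's address. A source-side entry along these lines would be needed (the CIDR is an assumption based on the pod IP 192.168.234.99 in the log; use `hostssl` instead if you want to require TLS and the standby connects with SSL):

```
# pg_hba.conf on the source cluster -- illustrative CIDR covering the pod network
host  replication  standby  192.168.0.0/16  md5
```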
Thank you for your response.