Open IIvyPy opened 1 year ago
I found the cause: the password of the standby user on my cluster differs from the password of the standby user on the source cluster. I would like to know how to set the same password for all users.
You need to define a Secret to set the password:
apiVersion: v1
kind: Secret
metadata:
  name: standby.${db_cluster}.credentials.postgresql.acid.zalan.do
  namespace: postgres
  labels:
    application: spilo
    cluster-name: ${db_cluster}
    team: audienti
stringData:
  username: standby
  password: ${source_password}
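To fill in ${source_password}, one approach is to read the standby user's password out of the source cluster's secret and create the matching secret for the standby cluster before deploying it. This is only a sketch: the secret, cluster, and namespace names below follow the operator's `<user>.<cluster>.credentials...` naming pattern and the examples in this thread, so adjust them for your setup.

```shell
# Hypothetical names following the operator's naming pattern -- adjust as needed.
# Read the standby password from the source cluster's secret:
source_password=$(kubectl get secret \
  standby.acid-minimal-cluster.credentials.postgresql.acid.zalan.do \
  -n default -o 'jsonpath={.data.password}' | base64 -d)

# Create the matching secret for the standby cluster before deploying it:
kubectl create secret generic \
  standby.acid-standby-cluster.credentials.postgresql.acid.zalan.do \
  -n postgres \
  --from-literal=username=standby \
  --from-literal=password="${source_password}"
```

Note that `.data.password` is base64-encoded in the stored Secret, hence the `base64 -d`; the `stringData` field in the manifest above exists precisely so you can supply the plain value instead.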
This is still happening to me on v1.12.2. Since it is a FATAL error, it consistently takes down my cluster every time I deploy (bare metal). Any tips?
$ kubectl exec -it -n postgres-operator postgres-cluster-0 -- patronictl show-config
failsafe_mode: false
loop_wait: 10
maximum_lag_on_failover: 33554432
postgresql:
  parameters:
    archive_mode: 'on'
    archive_timeout: 1800s
    autovacuum_analyze_scale_factor: 0.02
    autovacuum_max_workers: 5
    autovacuum_vacuum_scale_factor: 0.05
    checkpoint_completion_target: 0.9
    hot_standby: 'on'
    log_autovacuum_min_duration: 0
    log_checkpoints: 'on'
    log_connections: 'on'
    log_disconnections: 'on'
    log_line_prefix: '%t [%p]: [%l-1] %c %x %d %u %a %h '
    log_lock_waits: 'on'
    log_min_duration_statement: 500
    log_statement: ddl
    log_temp_files: 0
    max_connections: '640'
    max_replication_slots: 10
    max_wal_senders: 10
    tcp_keepalives_idle: 900
    tcp_keepalives_interval: 100
    track_functions: all
    wal_compression: 'on'
    wal_level: hot_standby
    wal_log_hints: 'on'
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30
And logs:
# kubectl logs -n postgres-operator postgres-cluster-2 | grep FATAL
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
...
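The FATAL lines say the "standby" role does not exist on the server being contacted. A quick way to confirm whether the role is present on the source cluster is a catalog query from inside the primary pod; the pod and namespace names here are the ones from this thread and may differ in your deployment.

```shell
# Illustrative pod/namespace names; prints 1 if the role exists, nothing otherwise
kubectl exec -n postgres-operator postgres-cluster-0 -- \
  psql -U postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname = 'standby'"
```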
2022-11-10 13:37:57,484 INFO: Selected new K8s API server endpoint https://172.16.3.44:6443
2022-11-10 13:37:57,544 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-11-10 13:37:57,551 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:37:57,680 INFO: trying to bootstrap a new standby leader
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:08,070 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:08,070 INFO: not healthy enough for leader race
2022-11-10 13:38:08,108 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:18,063 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:18,063 INFO: not healthy enough for leader race
2022-11-10 13:38:18,064 INFO: bootstrap_standby_leader in progress
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:28,052 ERROR: Error creating replica using method basebackup_fast_xlog: /scripts/basebackup.sh exited with code=1
2022-11-10 13:38:28,052 ERROR: failed to bootstrap clone from remote master postgresql://acid-minimal-cluster.default:5432
2022-11-10 13:38:28,053 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
2022-11-10 13:38:28,065 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:28,065 INFO: not healthy enough for leader race
2022-11-10 13:38:28,143 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:38,073 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 143, in main
    return patroni_main()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 135, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 100, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 105, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 59, in run
    self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 108, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1514, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1388, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1280, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1273, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
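Two distinct failures show up in this log: the password mismatch already discussed, and "no pg_hba.conf entry for replication connection ... no encryption", which means the source cluster also has no pg_hba rule matching an unencrypted replication connection from the standby pod's address. A source-side entry along these lines would be needed (the CIDR is an assumption based on the pod IP 192.168.234.99 in the log; use `hostssl` instead if you want to require TLS and the standby connects with SSL):

```
# pg_hba.conf on the source cluster -- illustrative CIDR covering the pod network
host  replication  standby  192.168.0.0/16  md5
```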
Thank you for your response.