vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on Patroni). Automating with Ansible.
https://postgresql-cluster.org
MIT License

Wait for port 8008 to become open on the host #448

Closed rrrru closed 1 year ago

rrrru commented 1 year ago
TASK [patroni : Wait for port 8008 to become open on the host] *************************************************************************************************************************************************************************************************************************************************
ok: [65.109.237.63]
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (1000 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (999 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (998 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (997 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (996 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (995 retries left).
FAILED - RETRYING: [65.109.237.63]: Check PostgreSQL is started and accepting connections on Master (994 retries left).

journalctl -u patroni -f

Aug 21 19:10:14 pgd-0 patroni[103199]: 2023-08-21 19:10:14,250 INFO: Lock owner: None; I am pgd-0
Aug 21 19:10:14 pgd-0 patroni[103199]: 2023-08-21 19:10:14,277 INFO: waiting for leader to bootstrap
Aug 21 19:10:24 pgd-0 patroni[103199]: 2023-08-21 19:10:24,251 INFO: Lock owner: None; I am pgd-0
Aug 21 19:10:24 pgd-0 patroni[103199]: 2023-08-21 19:10:24,279 INFO: waiting for leader to bootstrap
Aug 21 19:10:34 pgd-0 patroni[103199]: 2023-08-21 19:10:34,251 INFO: Lock owner: None; I am pgd-0
Aug 21 19:10:34 pgd-0 patroni[103199]: 2023-08-21 19:10:34,278 INFO: waiting for leader to bootstrap
Aug 21 19:10:44 pgd-0 patroni[103199]: 2023-08-21 19:10:44,250 INFO: Lock owner: None; I am pgd-0
Aug 21 19:10:44 pgd-0 patroni[103199]: 2023-08-21 19:10:44,279 INFO: waiting for leader to bootstrap
cat /etc/consul/conf.d/service_postgres-cluster-master.json
{
  "service": {
    "name": "postgres-cluster",
    "id": "postgres-cluster-master",
    "port": 6432,
    "checks": [{"http": "http://65.109.237.63:8008/primary", "interval": "2s"}, {"args": ["systemctl", "status", "pgbouncer"], "interval": "5s"}],
    "tags": ["master", "primary"]
  }
}

vars/main.yml

pgbouncer_install: false
root@pgd-0:~# netstat -tupln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:8500          0.0.0.0:*               LISTEN      2267/consul
tcp        0      0 65.109.237.63:8301      0.0.0.0:*               LISTEN      2267/consul
tcp        0      0 127.0.0.1:8600          0.0.0.0:*               LISTEN      2267/consul
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      2322/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      721/sshd: /usr/sbin
tcp        0      0 65.109.237.63:8008      0.0.0.0:*               LISTEN      103199/python3
tcp6       0      0 :::53                   :::*                    LISTEN      2322/dnsmasq
tcp6       0      0 :::22                   :::*                    LISTEN      721/sshd: /usr/sbin
udp        0      0 0.0.0.0:53              0.0.0.0:*                           2322/dnsmasq
udp        0      0 0.0.0.0:68              0.0.0.0:*                           704/dhclient
udp        0      0 0.0.0.0:68              0.0.0.0:*                           580/dhclient
udp        0      0 65.109.237.63:8301      0.0.0.0:*                           2267/consul
udp        0      0 127.0.0.1:8600          0.0.0.0:*                           2267/consul
udp6       0      0 :::53                   :::*                                2322/dnsmasq

I suspect that initialization does not work when pgbouncer is disabled (pgbouncer_install: false).

rrrru commented 1 year ago

Found this comment in the configuration:

# comment out this check if pgbouncer_install: false
    - { args: ["systemctl", "status", "pgbouncer"], interval: "5s" }  
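A minimal sketch of what that comment implies for the generated Consul service file: the pgbouncer check should only be present when pgbouncer is installed. The function `build_service_definition` and its parameters are hypothetical names for illustration, not part of the playbook, and using port 5432 when pgbouncer is disabled is an assumption.

```python
import json


def build_service_definition(cluster_name, host, port, pgbouncer_install, restapi_port=8008):
    """Return a dict shaped like service_postgres-cluster-master.json.

    Hypothetical helper: it only mirrors the JSON shown earlier in this issue.
    """
    # The Patroni REST API check is always present.
    checks = [
        {"http": f"http://{host}:{restapi_port}/primary", "interval": "2s"},
    ]
    # The pgbouncer check only makes sense when pgbouncer is installed;
    # otherwise "systemctl status pgbouncer" fails and the service stays unhealthy.
    if pgbouncer_install:
        checks.append({"args": ["systemctl", "status", "pgbouncer"], "interval": "5s"})
    return {
        "service": {
            "name": cluster_name,
            "id": f"{cluster_name}-master",
            "port": port,
            "checks": checks,
            "tags": ["master", "primary"],
        }
    }


if __name__ == "__main__":
    # Assumption: with pgbouncer disabled, clients connect to PostgreSQL directly on 5432.
    definition = build_service_definition("postgres-cluster", "65.109.237.63", 5432, False)
    print(json.dumps(definition, indent=2))
```

With `pgbouncer_install` set to `False`, the resulting definition carries only the `/primary` HTTP check.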
rrrru commented 1 year ago

Strangely, it doesn't work with patroni_cluster_name: "postgres-cluster"

Cleaned the environment with the command ansible-playbook -i inventory remove_cluster.yml -e remove_postgres=true -e remove_consul=true

Updated patroni_cluster_name to patroni_cluster_name: "postgres-cluster-test" and the role completed successfully

upd: After failed playbook runs, there are still records in Consul. The keys must be deleted from Consul, and then the run succeeds.

systemctl start consul
consul kv get -recurse # to get all keys
for example: consul kv delete -recurse service/postgres-cluster
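As a rough sketch of that cleanup, the snippet below only builds and prints the Consul CLI commands for a given cluster name, so they can be reviewed before anything is deleted. The helper `consul_cleanup_commands` is a hypothetical name, and the `service/` key prefix is taken from the example in this comment; other setups may use a different prefix.

```python
import shlex


def consul_cleanup_commands(cluster_name, prefix="service"):
    """Build (but do not run) the Consul commands that inspect and delete stale keys.

    Hypothetical helper; the key layout "<prefix>/<cluster_name>" follows the
    example key service/postgres-cluster from this issue.
    """
    key = f"{prefix}/{cluster_name}"
    return [
        # List everything stored under the cluster's key first.
        ["consul", "kv", "get", "-recurse", key],
        # Then remove the whole subtree left behind by the failed run.
        ["consul", "kv", "delete", "-recurse", key],
    ]


if __name__ == "__main__":
    for argv in consul_cleanup_commands("postgres-cluster"):
        print(shlex.join(argv))
```

Printing rather than executing keeps the destructive step explicit; the commands can be pasted into a shell on a Consul node once the listed keys look right.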

rrrru commented 1 year ago

Patroni