vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on Patroni). Automating with Ansible.
https://postgresql-cluster.org
MIT License

TASK [patroni : Wait for port 8008 to become open on the host] #720

Closed by warlockedward 2 months ago

warlockedward commented 3 months ago
TASK [patroni : Wait for port 8008 to become open on the host] ************************************************************************************************************
ok: [172.16.201.61]
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (1000 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (999 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (998 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (997 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (996 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (995 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (994 retries left).
FAILED - RETRYING: [172.16.201.61]: Check PostgreSQL is started and accepting connections on Master (993 retries left).
[root@lax2.db1 etc]# systemctl status patroni.service
● patroni.service - Runners to orchestrate a high-availability PostgreSQL - Patroni
     Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-08-03 21:44:55 CST; 2s ago
    Process: 980451 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
    Process: 980453 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
   Main PID: 980455 (patroni)
      Tasks: 4 (limit: 308925)
     Memory: 72.2M
        CPU: 1.648s
     CGroup: /system.slice/patroni.service
             └─980455 /usr/bin/python3 /usr/local/bin/patroni /etc/patroni/patroni.yml

Aug 03 21:44:57 pgnode01 patroni[980455]:     logger.info(self.ha.run_cycle())
Aug 03 21:44:57 pgnode01 patroni[980455]:   File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1980, in run_cycle
Aug 03 21:44:57 pgnode01 patroni[980455]:     info = self._run_cycle()
Aug 03 21:44:57 pgnode01 patroni[980455]:   File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1797, in _run_cycle
Aug 03 21:44:57 pgnode01 patroni[980455]:     return self.post_bootstrap()
Aug 03 21:44:57 pgnode01 patroni[980455]:   File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1681, in post_bootstrap
Aug 03 21:44:57 pgnode01 patroni[980455]:     self.cancel_initialization()
Aug 03 21:44:57 pgnode01 patroni[980455]:   File "/usr/local/lib/python3.10/dist-packages/patroni/ha.py", line 1674, in cancel_initialization
Aug 03 21:44:57 pgnode01 patroni[980455]:     raise PatroniFatalException('Failed to bootstrap cluster')
Aug 03 21:44:57 pgnode01 patroni[980455]: patroni.exceptions.PatroniFatalException: Failed to bootstrap cluster
[root@lax2.db1 etc]# systemctl status patroni.service
● patroni.service - Runners to orchestrate a high-availability PostgreSQL - Patroni
     Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-08-03 21:45:11 CST; 2s ago
    Process: 980825 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
    Process: 980827 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
   Main PID: 980829 (patroni)
      Tasks: 6 (limit: 308925)
     Memory: 70.4M
        CPU: 1.599s
     CGroup: /system.slice/patroni.service
             └─980829 /usr/bin/python3 /usr/local/bin/patroni /etc/patroni/patroni.yml

Aug 03 21:45:13 pgnode01 patroni[980856]:     /usr/lib/postgresql/15/bin/pg_ctl -D /data/pgdata/postgresql/15/main -l logfile start
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,008 INFO: establishing a new patroni heartbeat connection to postgres
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,013 INFO: establishing a new patroni heartbeat connection to postgres
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,014 INFO: establishing a new patroni heartbeat connection to postgres
Aug 03 21:45:14 pgnode01 patroni[980884]: 2024-08-03 21:45:14 CST [980884-1]  FATAL:  could not access file "timescaledb": No such file or directory
Aug 03 21:45:14 pgnode01 patroni[980884]: 2024-08-03 21:45:14 CST [980884-2]  LOG:  database system is shut down
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,314 INFO: postmaster pid=980884
Aug 03 21:45:14 pgnode01 patroni[980885]: /var/run/postgresql:5432 - no response
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,322 INFO: removing initialize key after failed attempt to bootstrap the cluster
Aug 03 21:45:14 pgnode01 patroni[980829]: 2024-08-03 21:45:14,326 INFO: renaming data directory to /data/pgdata/postgresql/15/main.failed

The problem repeats over and over: Patroni restarts, fails to bootstrap the cluster, and tries again.

warlockedward commented 3 months ago

It could not find timescaledb. It's working now.

vitabaks commented 3 months ago

Simply install the timescaledb package. Apparently you forgot to define it in the postgresql_packages variable.

Or run: ansible-playbook deploy_pgcluster.yml -e "enable_timescale=true"
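
For anyone hitting the same retry loop, the telling line in the journal output above is the PostgreSQL FATAL message about a file it could not access. The following is a minimal sketch, not part of postgresql_cluster or Patroni, that pulls such library names out of captured log text. The function name find_missing_libraries and the regular expression are illustrative assumptions based only on the log line quoted in this issue.

```python
import re

# Assumption: this pattern mirrors the FATAL line quoted in this issue.
# PostgreSQL typically emits it when a library it was asked to load
# (for example via shared_preload_libraries) is not present on disk.
MISSING_FILE = re.compile(
    r'FATAL:\s+could not access file "([^"]+)": No such file or directory'
)


def find_missing_libraries(log_text: str) -> list[str]:
    """Return the library names PostgreSQL failed to load, in first-seen order."""
    seen: list[str] = []
    for match in MISSING_FILE.finditer(log_text):
        name = match.group(1)
        if name not in seen:  # report each library only once
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Sample text copied from the journal output in this issue.
    sample = (
        "Aug 03 21:45:14 pgnode01 patroni[980884]: 2024-08-03 21:45:14 CST "
        '[980884-1]  FATAL:  could not access file "timescaledb": '
        "No such file or directory\n"
        "Aug 03 21:45:14 pgnode01 patroni[980884]: 2024-08-03 21:45:14 CST "
        "[980884-2]  LOG:  database system is shut down\n"
    )
    print(find_missing_libraries(sample))  # prints ['timescaledb']
```

Each name it reports is a candidate for a package missing from postgresql_packages, which is what happened here with timescaledb.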