hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

Broken packaging of postgresql-14-auto-failover-1.6 on Ubuntu 18.04 #876

Closed by s4ke 2 years ago

s4ke commented 2 years ago

While running tests for our Ansible scripts, we stumbled across this failure:

fatal: [monitor]: FAILED! => {"changed": true, "cmd": "PATH=\"$PATH:/usr/lib/postgresql/14/bin\" pg_autoctl create monitor --pgdata \"/var/lib/postgresql/14/main_cluster\" --skip-pg-hba --ssl-ca-file \"/data/ansible/certs/postgres_server/rootCA.crt\" --server-key \"/data/ansible/certs/postgres_server/server.key\" --server-cert \"/data/ansible/certs/postgres_server/server.crt\" --hostname \"10.0.0.10\" --pgport \"5433\"", "delta": "0:01:01.247335", "end": "2022-03-25 17:21:09.052898", "msg": "non-zero return code", "rc": 12, "start": "2022-03-25 17:20:07.805563", "stderr": "17:20:07 17917 INFO  Using default --ssl-mode \"verify-full\"
17:20:07 17917 INFO  Initialising a PostgreSQL cluster at \"/var/lib/postgresql/14/main_cluster\"
17:20:07 17917 INFO  /usr/lib/postgresql/14/bin/pg_ctl initdb -s -D /var/lib/postgresql/14/main_cluster --option '--auth=trust'
17:20:08 17917 INFO  Started pg_autoctl postgres service with pid 17936
17:20:08 17936 INFO   /usr/bin/pg_autoctl do service postgres --pgdata /var/lib/postgresql/14/main_cluster -v
17:20:08 17917 INFO  Started pg_autoctl monitor-init service with pid 17937
17:20:08 17941 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:18 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:18 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:18 17936 ERROR Failed to get Postgres pid, see above for details
17:20:18 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] LOG:  redirecting log output to logging collector process
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] HINT:  Future log output will appear in directory \"log\".
17:20:18 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172008.log\":
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:20:18 17936 ERROR 2022-03-25 17:20:08.612 UTC [17941] LOG:  listening on IPv6 address \"::\", port 5433
17:20:18 17936 FATAL 2022-03-25 17:20:08.618 UTC [17941] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:20:18 17936 ERROR 2022-03-25 17:20:08.618 UTC [17941] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:20:18 17936 ERROR 2022-03-25 17:20:08.619 UTC [17941] LOG:  database system is shut down
17:20:18 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:20:18 17937 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:18 17937 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:18 17937 ERROR Failed to get Postgres pid, see above for details
17:20:18 17937 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:20:18 17937 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:20:18 17944 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:18 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:20:18 17917 INFO  Restarting service monitor-init
17:20:28 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:28 17945 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:28 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:28 17945 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:28 17936 ERROR Failed to get Postgres pid, see above for details
17:20:28 17945 ERROR Failed to get Postgres pid, see above for details
17:20:28 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:20:28 17945 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:20:28 17936 ERROR 2022-03-25 17:20:18.259 UTC [17944] LOG:  redirecting log output to logging collector process
17:20:28 17945 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:20:28 17936 ERROR 2022-03-25 17:20:18.259 UTC [17944] HINT:  Future log output will appear in directory \"log\".
17:20:28 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172018.log\":
17:20:28 17936 ERROR 2022-03-25 17:20:18.259 UTC [17944] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:20:28 17936 ERROR 2022-03-25 17:20:18.260 UTC [17944] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:20:28 17936 ERROR 2022-03-25 17:20:18.260 UTC [17944] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:20:28 17936 ERROR 2022-03-25 17:20:18.260 UTC [17944] LOG:  listening on IPv6 address \"::\", port 5433
17:20:28 17936 FATAL 2022-03-25 17:20:18.267 UTC [17944] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:20:28 17936 ERROR 2022-03-25 17:20:18.267 UTC [17944] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:20:28 17936 ERROR 2022-03-25 17:20:18.268 UTC [17944] LOG:  database system is shut down
17:20:28 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:20:28 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:20:28 17917 INFO  Restarting service monitor-init
17:20:28 17951 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:38 17949 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:38 17949 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:38 17949 ERROR Failed to get Postgres pid, see above for details
17:20:38 17949 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:20:38 17949 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:20:38 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:38 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:38 17936 ERROR Failed to get Postgres pid, see above for details
17:20:38 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] LOG:  redirecting log output to logging collector process
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] HINT:  Future log output will appear in directory \"log\".
17:20:38 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172028.log\":
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:20:38 17936 ERROR 2022-03-25 17:20:28.257 UTC [17951] LOG:  listening on IPv6 address \"::\", port 5433
17:20:38 17936 FATAL 2022-03-25 17:20:28.262 UTC [17951] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:20:38 17936 ERROR 2022-03-25 17:20:28.262 UTC [17951] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:20:38 17936 ERROR 2022-03-25 17:20:28.263 UTC [17951] LOG:  database system is shut down
17:20:38 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:20:38 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:20:38 17917 INFO  Restarting service monitor-init
17:20:38 17958 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:48 17955 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:48 17955 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:48 17955 ERROR Failed to get Postgres pid, see above for details
17:20:48 17955 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:20:48 17955 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:20:48 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:48 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:48 17936 ERROR Failed to get Postgres pid, see above for details
17:20:48 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] LOG:  redirecting log output to logging collector process
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] HINT:  Future log output will appear in directory \"log\".
17:20:48 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172038.log\":
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:20:48 17936 ERROR 2022-03-25 17:20:38.273 UTC [17958] LOG:  listening on IPv6 address \"::\", port 5433
17:20:48 17936 FATAL 2022-03-25 17:20:38.279 UTC [17958] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:20:48 17936 ERROR 2022-03-25 17:20:38.279 UTC [17958] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:20:48 17936 ERROR 2022-03-25 17:20:38.280 UTC [17958] LOG:  database system is shut down
17:20:48 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:20:48 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:20:48 17917 INFO  Restarting service monitor-init
17:20:48 17963 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:58 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:58 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:58 17936 ERROR Failed to get Postgres pid, see above for details
17:20:58 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] LOG:  redirecting log output to logging collector process
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] HINT:  Future log output will appear in directory \"log\".
17:20:58 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172048.log\":
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:20:58 17936 ERROR 2022-03-25 17:20:48.514 UTC [17963] LOG:  listening on IPv6 address \"::\", port 5433
17:20:58 17936 FATAL 2022-03-25 17:20:48.520 UTC [17963] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:20:58 17936 ERROR 2022-03-25 17:20:48.520 UTC [17963] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:20:58 17936 ERROR 2022-03-25 17:20:48.521 UTC [17963] LOG:  database system is shut down
17:20:58 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:20:58 17960 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:20:58 17960 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:20:58 17960 ERROR Failed to get Postgres pid, see above for details
17:20:58 17960 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:20:58 17960 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:20:58 17966 INFO   /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main_cluster -p 5433 -h *
17:20:58 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:20:58 17917 INFO  Restarting service monitor-init
17:21:08 17967 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:21:08 17967 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:21:08 17967 ERROR Failed to get Postgres pid, see above for details
17:21:08 17967 ERROR Failed to ensure that Postgres is running in \"/var/lib/postgresql/14/main_cluster\"
17:21:08 17967 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
17:21:08 17917 ERROR pg_autoctl service monitor-init exited with exit status 12
17:21:08 17917 FATAL pg_autoctl service monitor-init has already been restarted 5 times in the last 50 seconds, stopping now
17:21:08 17936 ERROR Failed to open file \"/var/lib/postgresql/14/main_cluster/postmaster.pid\": No such file or directory
17:21:08 17936 INFO  Is PostgreSQL at \"/var/lib/postgresql/14/main_cluster\" up and running?
17:21:08 17936 ERROR Failed to get Postgres pid, see above for details
17:21:08 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/startup.log\":
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] LOG:  redirecting log output to logging collector process
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] HINT:  Future log output will appear in directory \"log\".
17:21:08 17936 WARN  Postgres logs from \"/var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172058.log\":
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] LOG:  could not bind IPv4 address \"0.0.0.0\": Address already in use
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
17:21:08 17936 ERROR 2022-03-25 17:20:58.313 UTC [17966] LOG:  listening on IPv6 address \"::\", port 5433
17:21:08 17936 FATAL 2022-03-25 17:20:58.318 UTC [17966] FATAL:  lock file \"/var/run/postgresql/.s.PGSQL.5433.lock\" already exists
17:21:08 17936 ERROR 2022-03-25 17:20:58.318 UTC [17966] HINT:  Is another postmaster (PID 17083) using socket file \"/var/run/postgresql/.s.PGSQL.5433\"?
17:21:08 17936 ERROR 2022-03-25 17:20:58.319 UTC [17966] LOG:  database system is shut down
17:21:08 17936 WARN  Failed to start Postgres instance at \"/var/lib/postgresql/14/main_cluster\"
17:21:08 17936 INFO  Postgres controller service received signal SIGTERM, terminating
17:21:09 17917 FATAL Something went wrong in sub-process supervision, stopping now. See above for details.
17:21:09 17917 INFO  Stop pg_autoctl", "stderr_lines": [
....

We are running on a non-standard port (5433).

It looks as if the first attempt to start Postgres is marked as failed and then retried, but the attempt that was marked as failed is actually up and running, so all subsequent attempts fail as well.
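The Postgres HINT lines in the output above already name the process holding the socket (PID 17083). A minimal sketch, assuming a bash shell and a saved copy of the log, for pulling that PID out so the owning process can be inspected; the function name `conflicting_postmaster_pid` is hypothetical and not part of pg_autoctl:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pg_auto_failover): read Postgres log text on
# stdin and print each distinct PID named in an
# "Is another postmaster (PID N) using socket file" hint.
conflicting_postmaster_pid() {
  sed -n 's/.*Is another postmaster (PID \([0-9][0-9]*\)) using socket file.*/\1/p' | sort -u
}

# Example usage (log path taken from the output above):
#   conflicting_postmaster_pid < /var/lib/postgresql/14/main_cluster/log/postgresql-2022-03-25_172008.log
#   ps -o pid,user,args -p 17083    # then look at what that process actually is
```

If the PID belongs to a postmaster from a different data directory, the port conflict comes from another cluster rather than from a leftover pg_autoctl attempt.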

s4ke commented 2 years ago

These are the logs from the log/ dir:

2022-03-25 17:20:08.612 UTC [17941] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:08.612 UTC [17941] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:08.612 UTC [17941] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:08.612 UTC [17941] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:08.618 UTC [17941] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:08.618 UTC [17941] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:08.619 UTC [17941] LOG:  database system is shut down
2022-03-25 17:20:18.259 UTC [17944] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:18.260 UTC [17944] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:18.260 UTC [17944] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:18.260 UTC [17944] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:18.267 UTC [17944] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:18.267 UTC [17944] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:18.268 UTC [17944] LOG:  database system is shut down
2022-03-25 17:20:28.257 UTC [17951] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:28.257 UTC [17951] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:28.257 UTC [17951] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:28.257 UTC [17951] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:28.262 UTC [17951] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:28.262 UTC [17951] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:28.263 UTC [17951] LOG:  database system is shut down
2022-03-25 17:20:38.273 UTC [17958] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:38.273 UTC [17958] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:38.273 UTC [17958] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:38.273 UTC [17958] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:38.279 UTC [17958] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:38.279 UTC [17958] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:38.280 UTC [17958] LOG:  database system is shut down
2022-03-25 17:20:48.514 UTC [17963] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:48.514 UTC [17963] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:48.514 UTC [17963] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:48.514 UTC [17963] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:48.520 UTC [17963] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:48.520 UTC [17963] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:48.521 UTC [17963] LOG:  database system is shut down
2022-03-25 17:20:58.313 UTC [17966] LOG:  starting PostgreSQL 14.2 (Ubuntu 14.2-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2022-03-25 17:20:58.313 UTC [17966] LOG:  could not bind IPv4 address "0.0.0.0": Address already in use
2022-03-25 17:20:58.313 UTC [17966] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
2022-03-25 17:20:58.313 UTC [17966] LOG:  listening on IPv6 address "::", port 5433
2022-03-25 17:20:58.318 UTC [17966] FATAL:  lock file "/var/run/postgresql/.s.PGSQL.5433.lock" already exists
2022-03-25 17:20:58.318 UTC [17966] HINT:  Is another postmaster (PID 17083) using socket file "/var/run/postgresql/.s.PGSQL.5433"?
2022-03-25 17:20:58.319 UTC [17966] LOG:  database system is shut down
s4ke commented 2 years ago

Closing this, as it seems unrelated.

s4ke commented 2 years ago

The issue was caused by Postgres 10 somehow being initialized and running:

[image]
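On Debian and Ubuntu, `pg_lsclusters` (from postgresql-common) lists every cluster with its version, port, and status, which is one way to spot an unexpected cluster like this. A sketch, assuming that tool's default column layout (Ver, Cluster, Port, Status, Owner, Data directory, Log file); the helper name `clusters_on_port` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `pg_lsclusters` output on stdin and print the
# version, name, and status of every cluster configured on the given port.
# Assumes the default columns: Ver Cluster Port Status Owner Data-directory Log-file.
clusters_on_port() {
  local port="$1"
  # The header row is skipped implicitly: its third field is "Port", not a number.
  awk -v port="$port" '$3 == port { print $1 "/" $2 " (" $4 ")" }'
}

# Example usage:
#   pg_lsclusters | clusters_on_port 5433
```

Any cluster reported here that is not the one pg_autoctl is meant to manage would be a candidate for the process already bound to the port.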

s4ke commented 2 years ago

Ah, this seems to be caused by installing pg-auto-failover-cli-1.6 before installing postgresql-14-auto-failover-1.6.

On Ubuntu 18.04 this automatically pulls in:

[image]

s4ke commented 2 years ago

Is this correct?

[image]

s4ke commented 2 years ago

postgresql-12-auto-failover-1.6 is fine:

[image]

The same goes for postgresql-13-auto-failover-1.6:

[image]

s4ke commented 2 years ago

If it helps, this is from an Ubuntu 20.04 machine that had a related error:

[image]

There we had to reorder the installation so that the CLI got installed first, but no postgres-10 was pulled in:

[image]

DimCitus commented 2 years ago

Hi @s4ke, thanks for digging into the issue so deeply. It looks like a Debian package dependency issue. Are you using the package from Citus or the package from apt.postgresql.org? The latter is maintained at https://github.com/dimitri/pgaf_debian and PRs are welcome of course!

DimCitus commented 2 years ago

Is it possible that the trouble comes from Ubuntu installing the Recommends packages automatically?
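One way to check this hypothesis is to list what a package declares under Recommends and, if that turns out to be the culprit, install with `--no-install-recommends`. A sketch, assuming the usual `apt-cache depends` output (one indented `Relation: package` entry per line); the helper name `recommended_packages` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `apt-cache depends <pkg>` output on stdin and
# print only the packages listed under "Recommends:" (including "|Recommends:"
# alternatives).
recommended_packages() {
  awk '$1 ~ /^[|]?Recommends:$/ { print $2 }'
}

# Example usage (package name taken from this thread):
#   apt-cache depends pg-auto-failover-cli-1.6 | recommended_packages
#   # Install without pulling in Recommends:
#   sudo apt-get install --no-install-recommends pg-auto-failover-cli-1.6
```

If the unexpected Postgres version shows up in that list, skipping Recommends at install time should avoid it; if it appears under Depends instead, the package metadata itself would need fixing.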

s4ke commented 2 years ago

I got this from a similar host, but essentially we are using the script from https://install.citusdata.com/community/deb.sh to configure the apt packages, after also setting up the Postgres package repository.

 500 https://repos.citusdata.com/community/ubuntu bionic/main amd64 Packages
     release v=1,o=packagecloud.io/citusdata/community,a=bionic,n=bionic,l=community,c=main,b=amd64
     origin repos.citusdata.com
 500 http://apt.postgresql.org/pub/repos/apt bionic-pgdg/main amd64 Packages
     release o=apt.postgresql.org,a=bionic-pgdg,n=bionic-pgdg,l=PostgreSQL for Debian/Ubuntu repository,c=main,b=amd64
     origin apt.postgresql.org

Additionally, we are using these policies:

[image]