Closed kevinelliott closed 3 years ago
Hi @kevinelliott; it seems that you have more than one Postgres setup available on your machine, and I am not sure why. As I happen to be running some QA testing on Debian VMs today, I had a look at those instances myself, and here is what I found:
ha-admin@ha-demo-dim-paris-b:~$ which -a pg_config | xargs ls -l
-rwxr-xr-x 1 root root 1229 Nov 5 2018 /bin/pg_config
-rwxr-xr-x 1 root root 1229 Nov 5 2018 /usr/bin/pg_config
ha-admin@ha-demo-dim-paris-b:~$ dpkg -S bin/pg_config
diversion by postgresql-common from: /usr/bin/pg_config
diversion by postgresql-common to: /usr/bin/pg_config.libpq-dev
diversion by postgresql-common from: /usr/bin/pg_config
diversion by postgresql-common to: /usr/bin/pg_config.libpq-dev
postgresql-common: /usr/bin/pg_config
postgresql-client-12: /usr/lib/postgresql/12/bin/pg_config
ha-admin@ha-demo-dim-paris-b:~$ ls -ld /bin /usr/bin
lrwxrwxrwx 1 root root 7 Oct 23 04:21 /bin -> usr/bin
drwxr-xr-x 2 root root 20480 Dec 7 10:50 /usr/bin
So Debian now defaults to a single location for binaries, with /bin kept as a symlink to support legacy PATH expectations, which breaks our assumptions in pg_autoctl.
The practical answer I can give you is that you need to either export a PG_CONFIG entry in your environment, as per the HINT in the log output you pasted above, or use the --pgctl option, as in: pg_autoctl create monitor --pgctl /usr/lib/postgresql/11/bin/pg_ctl --ssl-self-signed ...
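For anyone who wants to check whether their machine is in the same situation, here is a minimal sketch. The helper name unique_in_path is made up for this example and is not part of pg_autoctl; it assumes a readlink that supports -f, as on Debian and Ubuntu. It lists the distinct files a command resolves to across PATH, so /bin/pg_config and /usr/bin/pg_config count once when /bin is a symlink to usr/bin.

```shell
#!/bin/sh
# unique_in_path is a hypothetical helper, not part of pg_autoctl.
# It prints each distinct file that a command name resolves to across PATH,
# following symlinks so that /bin/X and /usr/bin/X count once when /bin -> usr/bin.
# Assumes readlink supports -f (GNU coreutils, as shipped on Debian/Ubuntu).
unique_in_path() {
    cmd=$1
    printf '%s\n' "$PATH" | tr ':' '\n' | while IFS= read -r dir; do
        # Skip empty PATH entries.
        [ -n "$dir" ] || continue
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            readlink -f "$dir/$cmd"
        fi
    done | sort -u
}

# Prints nothing when pg_config is not on PATH.
unique_in_path pg_config
```

If only one path comes back, the duplicate PATH entries are the same file seen through the symlink. In that case pinning the intended installation in the environment, for example with export PG_CONFIG=/usr/lib/postgresql/13/bin/pg_config (adjust the version directory to the one you actually have), should sidestep the ambiguity.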
@kevinelliott Based on your logs there also seems to be a second issue: you installed postgresql-11-auto-failover instead of postgresql-13-auto-failover. The first one is for PG11 and the second one for PG13. Based on your logs you already have PG13 installed.
Yes, so first I followed the directions to a T... when they didn't work, I noticed that postgresql-11-auto-failover was what the docs install, so I attempted to install postgresql-13-auto-failover in case it would improve the situation. It didn't. Then I uninstalled both and went back and tried postgresql-11-auto-failover again.
Thanks @DimCitus I will give that a try. Do you think future support will automatically detect this instead?
Progress, but now there is an issue with the run.
kelliott@af-db-controller:~$ pg_autoctl create monitor --ssl-self-signed --hostname 10.90.31.20 --auth trust --run --pgctl /usr/lib/postgresql/13/bin/pg_ctl
22:59:25 171624 INFO Using default --ssl-mode "require"
22:59:25 171624 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
22:59:25 171624 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
22:59:25 171624 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
22:59:25 171624 INFO Initialising a PostgreSQL cluster at "./monitor"
22:59:25 171624 INFO /usr/lib/postgresql/13/bin/pg_ctl initdb -s -D ./monitor --option '--auth=trust'
22:59:26 171624 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /home/kelliott/monitor/server.crt -keyout /home/kelliott/monitor/server.key -subj "/CN=10.90.31.20"
22:59:26 171624 INFO Started pg_autoctl postgres service with pid 171644
22:59:26 171644 INFO /usr/bin/pg_autoctl do service postgres --pgdata ./monitor -v
22:59:26 171624 INFO Started pg_autoctl listener service with pid 171645
22:59:26 171650 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
22:59:36 171645 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:36 171645 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:36 171645 ERROR Failed to get Postgres pid, see above for details
22:59:36 171645 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
22:59:36 171645 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
22:59:36 171624 ERROR pg_autoctl service listener exited with exit status 12
22:59:36 171624 INFO Restarting service listener
22:59:36 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:36 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:36 171644 ERROR Failed to get Postgres pid, see above for details
22:59:36 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
22:59:36 171644 ERROR 2020-12-08 22:59:26.539 UTC [171650] LOG: redirecting log output to logging collector process
22:59:36 171644 ERROR 2020-12-08 22:59:26.539 UTC [171650] HINT: Future log output will appear in directory "log".
22:59:36 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_225926.log":
22:59:36 171644 ERROR 2020-12-08 22:59:26.539 UTC [171650] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
22:59:36 171644 ERROR 2020-12-08 22:59:26.539 UTC [171650] LOG: listening on IPv4 address "0.0.0.0", port 5000
22:59:36 171644 ERROR 2020-12-08 22:59:26.540 UTC [171650] LOG: listening on IPv6 address "::", port 5000
22:59:36 171644 FATAL 2020-12-08 22:59:26.542 UTC [171650] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
22:59:36 171644 ERROR 2020-12-08 22:59:26.545 UTC [171650] LOG: database system is shut down
22:59:36 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
22:59:36 171659 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
22:59:46 171656 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:46 171656 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:46 171656 ERROR Failed to get Postgres pid, see above for details
22:59:46 171656 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
22:59:46 171656 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
22:59:46 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:46 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:46 171644 ERROR Failed to get Postgres pid, see above for details
22:59:46 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
22:59:46 171644 ERROR 2020-12-08 22:59:36.363 UTC [171659] LOG: redirecting log output to logging collector process
22:59:46 171644 ERROR 2020-12-08 22:59:36.363 UTC [171659] HINT: Future log output will appear in directory "log".
22:59:46 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_225936.log":
22:59:46 171644 ERROR 2020-12-08 22:59:36.364 UTC [171659] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
22:59:46 171644 ERROR 2020-12-08 22:59:36.364 UTC [171659] LOG: listening on IPv4 address "0.0.0.0", port 5000
22:59:46 171644 ERROR 2020-12-08 22:59:36.364 UTC [171659] LOG: listening on IPv6 address "::", port 5000
22:59:46 171644 FATAL 2020-12-08 22:59:36.367 UTC [171659] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
22:59:46 171644 ERROR 2020-12-08 22:59:36.370 UTC [171659] LOG: database system is shut down
22:59:46 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
22:59:46 171624 ERROR pg_autoctl service listener exited with exit status 12
22:59:46 171624 INFO Restarting service listener
22:59:46 171668 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
22:59:56 171665 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:56 171665 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:56 171665 ERROR Failed to get Postgres pid, see above for details
22:59:56 171665 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
22:59:56 171665 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
22:59:56 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
22:59:56 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
22:59:56 171644 ERROR Failed to get Postgres pid, see above for details
22:59:56 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
22:59:56 171644 ERROR 2020-12-08 22:59:46.285 UTC [171668] LOG: redirecting log output to logging collector process
22:59:56 171644 ERROR 2020-12-08 22:59:46.285 UTC [171668] HINT: Future log output will appear in directory "log".
22:59:56 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_225946.log":
22:59:56 171644 ERROR 2020-12-08 22:59:46.285 UTC [171668] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
22:59:56 171644 ERROR 2020-12-08 22:59:46.286 UTC [171668] LOG: listening on IPv4 address "0.0.0.0", port 5000
22:59:56 171644 ERROR 2020-12-08 22:59:46.286 UTC [171668] LOG: listening on IPv6 address "::", port 5000
22:59:56 171644 FATAL 2020-12-08 22:59:46.289 UTC [171668] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
22:59:56 171644 ERROR 2020-12-08 22:59:46.292 UTC [171668] LOG: database system is shut down
22:59:56 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
22:59:56 171624 ERROR pg_autoctl service listener exited with exit status 12
22:59:56 171624 INFO Restarting service listener
22:59:56 171677 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
23:00:06 171674 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:06 171674 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:06 171674 ERROR Failed to get Postgres pid, see above for details
23:00:06 171674 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
23:00:06 171674 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
23:00:06 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:06 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:06 171644 ERROR Failed to get Postgres pid, see above for details
23:00:06 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
23:00:06 171644 ERROR 2020-12-08 22:59:56.309 UTC [171677] LOG: redirecting log output to logging collector process
23:00:06 171644 ERROR 2020-12-08 22:59:56.309 UTC [171677] HINT: Future log output will appear in directory "log".
23:00:06 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_225956.log":
23:00:06 171644 ERROR 2020-12-08 22:59:56.309 UTC [171677] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
23:00:06 171644 ERROR 2020-12-08 22:59:56.309 UTC [171677] LOG: listening on IPv4 address "0.0.0.0", port 5000
23:00:06 171644 ERROR 2020-12-08 22:59:56.309 UTC [171677] LOG: listening on IPv6 address "::", port 5000
23:00:06 171644 FATAL 2020-12-08 22:59:56.312 UTC [171677] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
23:00:06 171644 ERROR 2020-12-08 22:59:56.315 UTC [171677] LOG: database system is shut down
23:00:06 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
23:00:06 171624 ERROR pg_autoctl service listener exited with exit status 12
23:00:06 171624 INFO Restarting service listener
23:00:06 171687 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
23:00:16 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:16 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:16 171644 ERROR Failed to get Postgres pid, see above for details
23:00:16 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
23:00:16 171644 ERROR 2020-12-08 23:00:06.337 UTC [171687] LOG: redirecting log output to logging collector process
23:00:16 171644 ERROR 2020-12-08 23:00:06.337 UTC [171687] HINT: Future log output will appear in directory "log".
23:00:16 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_230006.log":
23:00:16 171644 ERROR 2020-12-08 23:00:06.338 UTC [171687] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
23:00:16 171644 ERROR 2020-12-08 23:00:06.338 UTC [171687] LOG: listening on IPv4 address "0.0.0.0", port 5000
23:00:16 171644 ERROR 2020-12-08 23:00:06.338 UTC [171687] LOG: listening on IPv6 address "::", port 5000
23:00:16 171644 FATAL 2020-12-08 23:00:06.341 UTC [171687] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
23:00:16 171644 ERROR 2020-12-08 23:00:06.344 UTC [171687] LOG: database system is shut down
23:00:16 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
23:00:16 171685 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:16 171685 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:16 171685 ERROR Failed to get Postgres pid, see above for details
23:00:16 171685 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
23:00:16 171685 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
23:00:16 171694 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
23:00:16 171624 ERROR pg_autoctl service listener exited with exit status 12
23:00:16 171624 INFO Restarting service listener
23:00:26 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:26 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:26 171644 ERROR Failed to get Postgres pid, see above for details
23:00:26 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
23:00:26 171644 ERROR 2020-12-08 23:00:16.356 UTC [171694] LOG: redirecting log output to logging collector process
23:00:26 171644 ERROR 2020-12-08 23:00:16.356 UTC [171694] HINT: Future log output will appear in directory "log".
23:00:26 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_230016.log":
23:00:26 171644 ERROR 2020-12-08 23:00:16.356 UTC [171694] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
23:00:26 171644 ERROR 2020-12-08 23:00:16.356 UTC [171694] LOG: listening on IPv4 address "0.0.0.0", port 5000
23:00:26 171644 ERROR 2020-12-08 23:00:16.356 UTC [171694] LOG: listening on IPv6 address "::", port 5000
23:00:26 171644 FATAL 2020-12-08 23:00:16.360 UTC [171694] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
23:00:26 171644 ERROR 2020-12-08 23:00:16.363 UTC [171694] LOG: database system is shut down
23:00:26 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
23:00:26 171695 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:26 171695 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:26 171695 ERROR Failed to get Postgres pid, see above for details
23:00:26 171695 ERROR Failed to ensure that Postgres is running in "/home/kelliott/monitor"
23:00:26 171695 ERROR Failed to install pg_auto_failover in the monitor's Postgres database, see above for details
23:00:26 171704 INFO /usr/lib/postgresql/13/bin/postgres -D /home/kelliott/monitor -p 5000 -h *
23:00:26 171624 ERROR pg_autoctl service listener exited with exit status 12
23:00:26 171624 FATAL pg_autoctl service listener has already been restarted 5 times in the last 50 seconds, stopping now
23:00:26 171624 INFO Waiting for subprocesses to terminate.
23:00:31 171624 INFO pg_autoctl services are still running, signaling them with unknown signal.
23:00:36 171644 ERROR Failed to open file "/home/kelliott/monitor/postmaster.pid": No such file or directory
23:00:36 171644 INFO Is PostgreSQL at "/home/kelliott/monitor" up and running?
23:00:36 171644 ERROR Failed to get Postgres pid, see above for details
23:00:36 171644 WARN Postgres logs from "/home/kelliott/monitor/startup.log":
23:00:36 171644 ERROR 2020-12-08 23:00:26.276 UTC [171704] LOG: redirecting log output to logging collector process
23:00:36 171644 ERROR 2020-12-08 23:00:26.276 UTC [171704] HINT: Future log output will appear in directory "log".
23:00:36 171644 WARN Postgres logs from "/home/kelliott/monitor/log/postgresql-2020-12-08_230026.log":
23:00:36 171644 ERROR 2020-12-08 23:00:26.276 UTC [171704] LOG: starting PostgreSQL 13.1 (Ubuntu 13.1-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
23:00:36 171644 ERROR 2020-12-08 23:00:26.276 UTC [171704] LOG: listening on IPv4 address "0.0.0.0", port 5000
23:00:36 171644 ERROR 2020-12-08 23:00:26.276 UTC [171704] LOG: listening on IPv6 address "::", port 5000
23:00:36 171644 FATAL 2020-12-08 23:00:26.279 UTC [171704] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
23:00:36 171644 ERROR 2020-12-08 23:00:26.282 UTC [171704] LOG: database system is shut down
23:00:36 171644 WARN Failed to start Postgres instance at "/home/kelliott/monitor"
23:00:36 171644 INFO Postgres controller service received signal SIGTERM, terminating
23:00:36 171624 FATAL Something went wrong in sub-process supervision, stopping now. See above for details.
23:00:36 171624 INFO Stop pg_autoctl
kelliott@af-db-controller:~$
And the contents of the monitor dir:
kelliott@af-db-controller:~$ ls -l monitor/
total 144
drwx------ 5 kelliott kelliott 4096 Dec 8 22:59 base
-rw------- 1 kelliott kelliott 44 Dec 8 23:00 current_logfiles
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 global
drwx------ 2 kelliott kelliott 4096 Dec 8 23:00 log
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_commit_ts
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_dynshmem
-rw------- 1 kelliott kelliott 4760 Dec 8 22:59 pg_hba.conf
-rw------- 1 kelliott kelliott 1636 Dec 8 22:59 pg_ident.conf
drwx------ 4 kelliott kelliott 4096 Dec 8 22:59 pg_logical
drwx------ 4 kelliott kelliott 4096 Dec 8 22:59 pg_multixact
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_notify
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_replslot
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_serial
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_snapshots
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_stat
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_stat_tmp
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_subtrans
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_tblspc
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_twophase
-rw------- 1 kelliott kelliott 3 Dec 8 22:59 PG_VERSION
drwx------ 3 kelliott kelliott 4096 Dec 8 22:59 pg_wal
drwx------ 2 kelliott kelliott 4096 Dec 8 22:59 pg_xact
-rw------- 1 kelliott kelliott 88 Dec 8 22:59 postgresql.auto.conf
-rw-r--r-- 1 kelliott kelliott 956 Dec 8 22:59 postgresql-auto-failover.conf
-rw------- 1 kelliott kelliott 28185 Dec 8 22:59 postgresql.conf
-rw-rw-r-- 1 kelliott kelliott 4149 Dec 8 22:59 server.crt
-rw------- 1 kelliott kelliott 1708 Dec 8 22:59 server.key
-rw-r--r-- 1 kelliott kelliott 189 Dec 8 23:00 startup.log
Thanks @DimCitus I will give that a try. Do you think future support will automatically detect this instead?
Yes. We're going to work on that. Having a default “just work” user experience on debian/ubuntu is a natural goal for this project.
22:59:36 171644 FATAL 2020-12-08 22:59:26.542 UTC [171650] FATAL: could not create lock file "/var/run/postgresql/.s.PGSQL.5000.lock": Permission denied
That's the main problem. Are you creating your Postgres instance as the Debian postgres user, or another user? If another user, did you add that user to the postgres group? That's the Debian packages way...
More context: we could of course detect that in pg_autoctl and then create the socket directory somewhere else, like in /tmp per Postgres defaults when not using the Debian packaging; but then you need to use psql -h /tmp because the libpq client applications on Debian will also look for Unix sockets in /var/run/postgresql/ by default.
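A quick way to confirm this before re-running pg_autoctl is to check whether the current user can write to the socket directory at all. The sketch below is only an illustration: can_write_socket_dir is a made-up helper name, and it only tests directory permissions, nothing specific to pg_autoctl.

```shell
#!/bin/sh
# can_write_socket_dir is a hypothetical helper, not part of pg_autoctl.
# It reports whether the current user may create files (such as the
# .s.PGSQL.<port>.lock file) in the given Unix socket directory.
can_write_socket_dir() {
    dir=${1:-/var/run/postgresql}
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "ok: $(id -un) can write to $dir"
    else
        echo "denied: $(id -un) cannot write to $dir"
        return 1
    fi
}

# On failure, also show the groups of the current user for comparison.
can_write_socket_dir /var/run/postgresql || echo "groups for $(id -un): $(id -Gn)"
```

If the check is denied and postgres is missing from the group list, either run the steps as the postgres user or add your user to that group (for instance with sudo usermod -aG postgres followed by your user name, then log in again). This assumes the Debian packaging gave the postgres group write access to the directory, which is worth verifying with ls -ld /var/run/postgresql.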
That fixed it. I decided to sudo su - postgres and run all the steps as that user rather than add my personal user to the postgres group, and it worked like I have seen before on other systems. Hurray!
All is good now, the monitor and 2 nodes are running.
However, I just ran into another issue. node_2 had been promoted to primary since I took node_1 down at one point. I brought it back up and node_1 was successfully assigned secondary. Then, using pg_autoctl on the monitor, I promoted node_1 to primary with pg_autoctl perform promotion --name node_1.
Immediately there was an issue with node_2: it went into an error loop, complaining that the node2 path was missing. I have a second SSD (/dev/sdb1) with 1TB of space mounted at /srv/db and had put the node2 dir there, then symbolically linked /var/lib/postgresql/node2 to it. Apparently the demotion caused the source dir to disappear while the symbolic link still existed. I was able to simply copy the backup dir backups/node_2 to /srv/db/node2 and all was well there. Node 2 is successfully identified as secondary.
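The "symlink still there, target gone" state described above can be spotted with a plain shell test. This is only a sketch: is_dangling_symlink is a made-up helper name, and the path is the one mentioned in this comment.

```shell
#!/bin/sh
# is_dangling_symlink is a hypothetical helper name.
# It succeeds when the path is a symbolic link whose target no longer exists:
# -L is true for the link itself, -e follows the link and fails if the target is gone.
is_dangling_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Path taken from the description above; adjust to your own layout.
if is_dangling_symlink /var/lib/postgresql/node2; then
    echo "node2 link exists but its target is missing"
fi
```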
However, the primary node node1 is then in a wait_primary state and errors with:
postgres@af-db-node1:~$ pg_autoctl create postgres --hostname 10.90.31.21 --auth trust --ssl-self-signed --pgctl /usr/lib/postgresql/13/bin/pg_ctl --monitor 'postgres://autoctl_node@10.90.31.20:5000/pg_auto_failover?sslmode=require' --run
20:10:25 13610 INFO Using default --ssl-mode "require"
20:10:25 13610 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
20:10:25 13610 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
20:10:25 13610 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
20:10:25 13610 INFO Started pg_autoctl postgres service with pid 13615
20:10:25 13615 INFO /usr/bin/pg_autoctl do service postgres --pgdata ./node1 -v
20:10:25 13610 INFO Started pg_autoctl node-active service with pid 13616
20:10:25 13616 INFO keeper has been successfully initialized.
20:10:25 13616 INFO /usr/bin/pg_autoctl do service node-active --pgdata ./node1 -v
20:10:25 13616 INFO Reloaded the new configuration from "/var/lib/postgresql/.config/pg_autoctl/var/lib/postgresql/node1/pg_autoctl.cfg"
20:10:25 13616 INFO pg_autoctl service is running, current state is "wait_primary"
20:10:25 13616 WARN Failed to update the keeper's state from the local PostgreSQL instance.
20:10:25 13616 INFO Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
20:10:25 13616 INFO Ensuring HBA rules for node 2 "node_2" (10.90.31.22:5002)
20:10:25 13616 INFO Monitor assigned new state "primary"
20:10:25 13625 INFO /usr/lib/postgresql/13/bin/postgres -D /srv/db/node1 -p 5001 -h *
20:10:26 13615 INFO Postgres is now serving PGDATA "/srv/db/node1" on port 5001 with pid 13625
20:10:26 13616 WARN PostgreSQL was not running, restarted with pid 13625
20:10:26 13616 INFO FSM transition from "wait_primary" to "primary": A healthy secondary appeared
20:10:26 13616 INFO Setting synchronous_standby_names to '*'
20:10:26 13616 WARN Failed to set the standby Target LSN because we don't have a quorum candidate yet
20:10:26 13616 ERROR Failed to transition from state "wait_primary" to state "primary", see above.
20:10:26 13616 ERROR Failed to transition to state "primary", retrying...
20:10:27 13616 INFO Updated the keeper's state from the local PostgreSQL instance, which is running
20:10:27 13616 INFO Monitor assigned new state "primary"
20:10:27 13616 INFO FSM transition from "wait_primary" to "primary": A healthy secondary appeared
20:10:27 13616 INFO Setting synchronous_standby_names to '*'
20:10:27 13616 WARN Failed to set the standby Target LSN because we don't have a quorum candidate yet
20:10:27 13616 ERROR Failed to transition from state "wait_primary" to state "primary", see above.
20:10:27 13616 ERROR Failed to transition to state "primary", retrying...
I would imagine it's due to the second disk and having a symbolic link as reference. What would the cleanest way to remedy this be?
Please check your streaming replication setup in node2, and then have a look at pg_stat_replication on node1. It looks like your node2 is currently not connected to node1, or that something else is wrong with streaming replication. To debug that, you need to have a look at node2 logs for pg_autoctl and Postgres, and sometimes also Postgres logs from node1.
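As a rough sketch of the pg_stat_replication check, something like the following could be run on node1. The helper name show_standbys is made up, and the sketch assumes that psql is on the PATH, that node1 listens on port 5001 as in the log above, and that the default postgres database is reachable over the local socket.

```shell
#!/bin/sh
# show_standbys is a hypothetical helper, not part of pg_autoctl.
# It lists the standbys currently streaming from the local Postgres instance.
# Assumes psql is on PATH; the default port 5001 is taken from the log above.
show_standbys() {
    port=${1:-5001}
    # -X skips ~/.psqlrc, -A -t give unaligned tuples-only output,
    # -F sets the field separator.
    psql -X -A -t -F ' | ' -p "$port" -d postgres \
         -c "SELECT application_name, client_addr, state, sync_state FROM pg_stat_replication"
}
```

Calling show_standbys 5001 as the postgres user on node1 and getting no rows back would mean that no standby is connected, which would be consistent with the "we don't have a quorum candidate yet" warning; a row whose state is streaming would point the investigation elsewhere.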
I'm not sure what's up, but on brand new Ubuntu 20.04 installations the install does not seem to work.
Installation of packages goes fine:
The version installed:
Then on to setting up the monitor:
But then the monitor doesn't seem set up: