The reason for configuring PGUSER/PGPASSWORD is of course to enable wal-g for backups (cf. #71 and this forum topic).
@Qqwy can you try this workaround for now? When setting PGUSER and PGPASSWORD for wal-g, set PGUSER to repluser and PGPASSWORD to the result of running echo $REPL_PASSWORD on one of your database machines.
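A minimal sketch of applying this, assuming the app is called db-app-name and that the stolon environment lives in /data/.env as on the standard fly.io Postgres machines:

```sh
# On one of the database machines (e.g. via `fly ssh console --app db-app-name`),
# load the stolon environment and print the replication password:
export $(cat /data/.env | xargs)
echo $REPL_PASSWORD

# Back on your workstation, point wal-g at the replication user
# (replace <repl-password> with the value printed above):
fly secrets set --app db-app-name PGUSER=repluser PGPASSWORD=<repl-password>
```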
Thank you. This workaround seems to work 👍.

Side note: it does mean that, for the time being, calling wal-g manually requires overriding these values, e.g. when performing a manual backup:

PGUSER=postgres PGPASSWORD=$OPERATOR_PASSWORD wal-g backup-push /data/postgres
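For completeness, a sketch of a full manual invocation on one of the machines, assuming the /data/.env layout used later in this thread:

```sh
# Load the cluster environment, which contains OPERATOR_PASSWORD among other secrets
export $(cat /data/.env | xargs)
# Run the backup as the postgres superuser, overriding wal-g's configured credentials
PGUSER=postgres PGPASSWORD=$OPERATOR_PASSWORD wal-g backup-push /data/postgres
```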
I have not tested yet whether the full backup+restore flow with wal-g works OK with the workaround in place. Hopefully I will have time to do so tomorrow.
I finally had the time to revisit this. Unfortunately, another error occurs during the initialization process of a restore later on.
The basis of a restore is a second cluster where OPERATOR_PASSWORD, SU_PASSWORD and REPL_PASSWORD are set to the same values as in the original cluster, with FLY_RESTORED_FROM also set to ensure stolonctl will apply these settings on cluster restart (otherwise you'll get authentication issues). The following is what you should run to trigger the restore, as per stolon's docs:
export $(cat /data/.env | xargs)
stolonctl init '{ "initMode": "pitr", "pitrConfig": {"dataRestoreCommand": "wal-g backup-fetch %d LATEST" , "archiveRecoverySettings": { "restoreCommand": "wal-g wal-fetch \"%f\" \"%p\"" } } }'
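For reference, a rough sketch of how the second cluster's secrets could be pre-seeded before running the init above; the app name and the exact FLY_RESTORED_FROM value are placeholders, and the password values should be copied from the original cluster's /data/.env:

```sh
fly secrets set --app restore-db-app-name \
  OPERATOR_PASSWORD=<same as original cluster> \
  SU_PASSWORD=<same as original cluster> \
  REPL_PASSWORD=<same as original cluster> \
  FLY_RESTORED_FROM=<marker value; see note above>
```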
But this will not succeed. Instead, the keeper will error with:
FATAL: could not start WAL streaming: ERROR: replication slot "stolon_1723880d" is active for PID 649
Actually, the above issue was caused by something else: you need to make sure you do not enable wal-g on the restore cluster until after you've restored. Makes sense in hindsight.
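In practice, that means leaving the wal-g-related secrets off the new app until the restore has finished, and only then setting them; a hedged sketch, with example secret names:

```sh
# Only once the restore has completed: enable wal-g on the restored cluster.
# WALG_S3_PREFIX is just one example of a wal-g storage setting; use whichever
# storage configuration and credentials your wal-g setup relies on.
fly secrets set --app restore-db-app-name \
  WALG_S3_PREFIX=s3://your-backup-bucket/path \
  PGUSER=repluser \
  PGPASSWORD=<replication password, as in the workaround above>
```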
I'll close this issue now 👌.
1. fly pg create
2. fly secrets set --app=db-app-name PGUSER="postgres" PGPASSWORD="passwordhere", with the password shown in the first step.
3. fly machines list to note the first machine's ID, and then fly machines clone machineid.

The problematic state is now in effect. The new machine will not become healthy (though at some point the fly machines clone command will terminate regardless). fly logs --app=db-app-name will show the following error message, repeating every few seconds:

And obviously, the new machine is not part of the new cluster, as can be seen using stolonctl status.
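(As with the restore steps, stolonctl needs the cluster environment loaded; a minimal sketch of checking the cluster state from one of the machines:)

```sh
# On any of the database machines:
export $(cat /data/.env | xargs)
stolonctl status
```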