Closed: tirelibirefe closed this issue 1 year ago
I was able to complete a restore from an encrypted bucket in S3. Two questions:
(1) After the restore pod finished, did you check the new postgres pod to see what was there? For comparison, here are the logs of my recent restore job:
+ pgbackrest restore --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --link-map=pg_wal=/pgdata/pg14_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg14' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2022-10-12 21:16:14.310 GMT [18] LOG: starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
2022-10-12 21:16:14.310 GMT [18] LOG: listening on IPv6 address "::1", port 5432
2022-10-12 21:16:14.310 GMT [18] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-10-12 21:16:14.315 GMT [18] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-12 21:16:14.320 GMT [19] LOG: database system was interrupted; last known up at 2022-10-12 21:06:08 GMT
2022-10-12 21:16:14.833 GMT [19] LOG: starting archive recovery
2022-10-12 21:16:15.608 GMT [19] LOG: restored log file "000000010000000000000007" from archive
2022-10-12 21:16:15.652 GMT [19] LOG: redo starts at 0/7000028
2022-10-12 21:16:16.257 GMT [19] LOG: restored log file "000000010000000000000008" from archive
2022-10-12 21:16:16.275 GMT [19] LOG: consistent recovery state reached at 0/8000050
2022-10-12 21:16:16.275 GMT [18] LOG: database system is ready to accept read-only connections
2022-10-12 21:16:16.723 GMT [19] LOG: restored log file "000000010000000000000009" from archive
2022-10-12 21:16:17.051 GMT [19] LOG: redo done at 0/9000148 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 1.39 s
2022-10-12 21:16:17.051 GMT [19] LOG: last completed transaction was at log time 2022-10-12 21:06:17.63812+00
2022-10-12 21:16:17.509 GMT [19] LOG: restored log file "000000010000000000000009" from archive
2022-10-12 21:16:17.859 GMT [19] LOG: selected new timeline ID: 2
2022-10-12 21:16:17.897 GMT [19] LOG: archive recovery complete
2022-10-12 21:16:18.203 GMT [18] LOG: database system is ready to accept connections
2022-10-12 21:16:18.206 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:18.206 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:19.212 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:19.212 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:19.350 GMT [18] LOG: received fast shutdown request
2022-10-12 21:16:19.353 GMT [18] LOG: aborting any active transactions
2022-10-12 21:16:19.355 GMT [18] LOG: background worker "logical replication launcher" (PID 42) exited with exit code 1
2022-10-12 21:16:19.358 GMT [22] LOG: shutting down
2022-10-12 21:16:19.402 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:19.402 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:19.403 GMT [41] WARNING: archiving write-ahead log file "00000002.history" failed too many times, will try again later
2022-10-12 21:16:19.405 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:19.405 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:20.412 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:20.412 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:21.382 GMT [41] LOG: archive command failed with exit code 1
2022-10-12 21:16:21.382 GMT [41] DETAIL: The failed archive command was: false
2022-10-12 21:16:21.382 GMT [41] WARNING: archiving write-ahead log file "00000002.history" failed too many times, will try again later
2022-10-12 21:16:21.387 GMT [18] LOG: database system is shut down
Looks pretty similar, and when I go to check the postgres pod, the restore has completed successfully. So: on your side, did the postgres pod come up successfully with the data that you expected?
(2) Why does your configuration have two secrets, and what do they look like? (I mean, minus the actual secret data -- just what fields/volumes are getting mounted to the restore pod?)
Also, just to be clear, are you using the pgbackrest encryption as discussed here: https://access.crunchydata.com/documentation/postgres-operator/5.1.1/tutorial/backups/#encryption
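For reference, the recipe in that tutorial boils down to a Secret carrying the cipher passphrase as a pgBackRest config fragment, referenced from the cluster spec alongside repo1-cipher-type. A sketch (the Secret name and the encryption.conf key are illustrative, not from this cluster):

```yaml
# Sketch of the encryption setup from the linked tutorial.
# Secret name and key "encryption.conf" are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: mycluster-pgbackrest-secrets
type: Opaque
stringData:
  encryption.conf: |-
    [global]
    repo1-cipher-pass=<your-passphrase>
```

The Secret is then listed under spec.backups.pgbackrest.configuration, with repo1-cipher-type: aes-256-cbc set under global.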
Hello @benjaminjb, this is a very important topic and I'll provide more information about it; please don't close it for a while.
Thanks & Regards
> Also, just to be clear, are you using the pgbackrest encryption as discussed here: https://access.crunchydata.com/documentation/postgres-operator/5.1.1/tutorial/backups/#encryption
Hello @benjaminjb, please excuse the delay. Since I had cleaned up the environment, I had to reinstall everything to reproduce the issue; database names may have changed, but I use the same manifests.
All my backups are encrypted, and all of them are written to S3.
...and my answer is yes, I used the recipe there.
$ k get secrets -n psql-ferahevler
NAME TYPE DATA AGE
default-token-n82m6 kubernetes.io/service-account-token 3 44m
ferahevler-instance-token-64hbg kubernetes.io/service-account-token 3 40m
ferahevler-monitoring Opaque 2 40m
ferahevler-pgbackrest-secrets Opaque 1 40m
ferahevler-pgbackrest-token-xhth8 kubernetes.io/service-account-token 3 40m
ferahevler-pgbouncer Opaque 6 40m
ferahevler-pgha1-bgqk-certs Opaque 4 40m
ferahevler-pgha1-xgxz-certs Opaque 4 40m
ferahevler-pguser-ferahevler Opaque 12 40m
ferahevler-pguser-postgres Opaque 7 40m
ferahevler-repl-tls kubernetes.io/tls 3 40m
ferahevler-tls kubernetes.io/tls 3 40m
pgo-root-cacert Opaque 2 40m
psql-ferahevler-s3-creds Opaque 1 40m
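The two pgBackRest-related secrets are psql-ferahevler-s3-creds (the S3 credentials) and ferahevler-pgbackrest-secrets (the cipher passphrase). Minus the actual data, and assuming they follow the Crunchy tutorials (the s3.conf key name is from there and may differ), the S3 one looks roughly like:

```yaml
# Sketch only; values redacted, key name per the Crunchy S3 tutorial.
apiVersion: v1
kind: Secret
metadata:
  name: psql-ferahevler-s3-creds
type: Opaque
stringData:
  s3.conf: |-
    [global]
    repo1-s3-key=<access-key-id>
    repo1-s3-key-secret=<secret-access-key>
```

The second secret carries repo1-cipher-pass the same way.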
and here is the full config:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: ferahevler
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: ferahevler-repl-tls
  customTLSSecret:
    name: ferahevler-tls
  users:
    - name: postgres
    - name: ferahevler
      databases:
        - ferahevler
  instances:
    - name: pgha1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: ebs-sc-enc
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
        - secret:
            name: psql-ferahevler-s3-creds
        - secret:
            name: ferahevler-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
          - --type=full
      global:
        repo1-path: /repo1/ferahevler
        repo1-retention-full: "30"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
        - name: repo1
          schedules:
            full: "0 */5 * * *"
            # incremental: "0 */6 * * *"
          s3:
            bucket: mybucketnamehere
            endpoint: s3.eu-central-1.amazonaws.com:443
            region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/role: pgbouncer
Scenario-1: psql-restore-ferahevler-enc-v1
I deleted the PostgreSQL cluster (ferahevler) above and tried to restore to a new cluster with a different name; the new cluster name is seyrantepe.
Here are my backups: all of them are encrypted and written to S3.
I'll try to restore from a specific backup (20221024-105247F).
Here is the config:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: seyrantepe
  namespace: psql-seyrantepe
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: seyrantepe-repl-tls
  customTLSSecret:
    name: seyrantepe-tls
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
        - secret:
            name: psql-seyrantepe-s3-creds
        - secret:
            name: seyrantepe-pgbackrest-secrets
      global:
        repo1-path: /repo1/ferahevler
        repo1-cipher-type: aes-256-cbc
      repo:
        name: repo1
        s3:
          bucket: ku-eksdev1-crunchydata-backups
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
  users:
    - name: postgres
    - name: seyrantepe
      databases:
        - seyrantepe
  instances:
    - name: pgha1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: ebs-sc-enc
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: seyrantepe
                    postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
        - secret:
            name: psql-seyrantepe-s3-creds
        - secret:
            name: seyrantepe-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
          - --type=full
      global:
        repo1-path: /repo1/ferahevler
        repo1-retention-full: "30"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
        - name: repo1
          schedules:
            full: "0 */5 * * *"
            # incremental: "0 */6 * * *"
          s3:
            bucket: ku-eksdev1-crunchydata-backups
            endpoint: s3.eu-central-1.amazonaws.com:443
            region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: seyrantepe
                    postgres-operator.crunchydata.com/role: pgbouncer
...and logs:
$ k logs -n psql-seyrantepe seyrantepe-pgbackrest-restore-zq7c4 -f
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
+ pgbackrest restore --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --link-map=pg_wal=/pgdata/pg14_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg14' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2022-10-24 17:33:13.540 GMT [19] LOG: starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
2022-10-24 17:33:13.541 GMT [19] LOG: listening on IPv6 address "::1", port 5432
2022-10-24 17:33:13.541 GMT [19] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-10-24 17:33:13.548 GMT [19] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-24 17:33:13.558 GMT [20] LOG: database system was interrupted; last known up at 2022-10-24 15:00:10 GMT
2022-10-24 17:33:13.717 GMT [20] LOG: starting archive recovery
2022-10-24 17:33:13.918 GMT [20] LOG: restored log file "000000010000000000000045" from archive
2022-10-24 17:33:13.987 GMT [20] LOG: redo starts at 0/45000028
2022-10-24 17:33:14.153 GMT [20] LOG: restored log file "000000010000000000000046" from archive
2022-10-24 17:33:14.254 GMT [20] LOG: consistent recovery state reached at 0/46000088
2022-10-24 17:33:14.255 GMT [19] LOG: database system is ready to accept read-only connections
2022-10-24 17:33:14.466 GMT [20] LOG: restored log file "000000010000000000000047" from archive
2022-10-24 17:33:14.722 GMT [20] LOG: restored log file "000000010000000000000048" from archive
2022-10-24 17:33:14.990 GMT [20] LOG: restored log file "000000010000000000000049" from archive
2022-10-24 17:33:15.252 GMT [20] LOG: restored log file "00000001000000000000004A" from archive
2022-10-24 17:33:15.765 GMT [20] LOG: restored log file "00000001000000000000004B" from archive
2022-10-24 17:33:20.419 GMT [20] LOG: restored log file "00000001000000000000004C" from archive
2022-10-24 17:33:20.928 GMT [20] LOG: restored log file "00000001000000000000004D" from archive
2022-10-24 17:33:21.155 GMT [20] LOG: restored log file "00000001000000000000004E" from archive
2022-10-24 17:33:21.475 GMT [20] LOG: restored log file "00000001000000000000004F" from archive
2022-10-24 17:33:21.899 GMT [20] LOG: restored log file "000000010000000000000050" from archive
2022-10-24 17:33:22.183 GMT [20] LOG: restored log file "000000010000000000000051" from archive
2022-10-24 17:33:22.381 GMT [20] LOG: restored log file "000000010000000000000052" from archive
2022-10-24 17:33:22.644 GMT [20] LOG: restored log file "000000010000000000000053" from archive
2022-10-24 17:33:22.904 GMT [20] LOG: restored log file "000000010000000000000054" from archive
2022-10-24 17:33:23.190 GMT [20] LOG: restored log file "000000010000000000000055" from archive
2022-10-24 17:33:23.492 GMT [20] LOG: restored log file "000000010000000000000056" from archive
2022-10-24 17:33:23.827 GMT [20] LOG: restored log file "000000010000000000000057" from archive
2022-10-24 17:33:25.716 GMT [20] LOG: restored log file "000000010000000000000058" from archive
2022-10-24 17:33:29.021 GMT [20] LOG: restored log file "000000010000000000000059" from archive
2022-10-24 17:33:30.943 GMT [20] LOG: restored log file "00000001000000000000005A" from archive
2022-10-24 17:33:31.244 GMT [20] LOG: restored log file "00000001000000000000005B" from archive
2022-10-24 17:33:31.611 GMT [20] LOG: restored log file "00000001000000000000005C" from archive
2022-10-24 17:33:32.605 GMT [20] LOG: restored log file "00000001000000000000005D" from archive
2022-10-24 17:33:32.818 GMT [20] LOG: restored log file "00000001000000000000005E" from archive
2022-10-24 17:33:33.299 GMT [20] LOG: restored log file "00000001000000000000005F" from archive
2022-10-24 17:33:33.679 GMT [20] LOG: restored log file "000000010000000000000060" from archive
2022-10-24 17:33:34.011 GMT [20] LOG: restored log file "000000010000000000000061" from archive
2022-10-24 17:33:34.376 GMT [20] LOG: restored log file "000000010000000000000062" from archive
2022-10-24 17:33:34.645 GMT [20] LOG: restored log file "000000010000000000000063" from archive
2022-10-24 17:33:34.898 GMT [20] LOG: restored log file "000000010000000000000064" from archive
2022-10-24 17:33:35.165 GMT [20] LOG: restored log file "000000010000000000000065" from archive
2022-10-24 17:33:35.449 GMT [20] LOG: restored log file "000000010000000000000066" from archive
2022-10-24 17:33:35.718 GMT [20] LOG: restored log file "000000010000000000000067" from archive
2022-10-24 17:33:35.897 GMT [20] LOG: redo done at 0/67000110 system usage: CPU: user: 0.00 s, system: 0.11 s, elapsed: 21.91 s
2022-10-24 17:33:35.897 GMT [20] LOG: last completed transaction was at log time 2022-10-24 17:23:30.336398+00
2022-10-24 17:33:36.389 GMT [20] LOG: restored log file "000000010000000000000067" from archive
2022-10-24 17:33:38.933 GMT [20] LOG: selected new timeline ID: 2
2022-10-24 17:33:41.553 GMT [20] LOG: archive recovery complete
2022-10-24 17:33:41.895 GMT [19] LOG: database system is ready to accept connections
2022-10-24 17:33:41.899 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:41.899 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:42.904 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:42.904 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:43.593 GMT [19] LOG: received fast shutdown request
2022-10-24 17:33:43.596 GMT [19] LOG: aborting any active transactions
2022-10-24 17:33:43.608 GMT [19] LOG: background worker "logical replication launcher" (PID 175) exited with exit code 1
2022-10-24 17:33:43.610 GMT [24] LOG: shutting down
2022-10-24 17:33:43.908 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:43.908 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:43.908 GMT [174] WARNING: archiving write-ahead log file "00000002.history" failed too many times, will try again later
2022-10-24 17:33:46.139 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:46.139 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:46.211 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:46.211 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:47.216 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:47.216 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:47.216 GMT [174] WARNING: archiving write-ahead log file "00000002.history" failed too many times, will try again later
2022-10-24 17:33:47.219 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:47.219 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:48.223 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:48.223 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:49.226 GMT [174] LOG: archive command failed with exit code 1
2022-10-24 17:33:49.226 GMT [174] DETAIL: The failed archive command was: false
2022-10-24 17:33:49.226 GMT [174] WARNING: archiving write-ahead log file "00000002.history" failed too many times, will try again later
2022-10-24 17:33:49.230 GMT [19] LOG: database system is shut down
I checked the DB; nothing was restored.
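(Side note on this scenario: the dataSource above never passes --set, so a restore here would target the latest backup rather than the specific one I named. If the intent is to pin that backup, the cloud dataSource would presumably need an options list, e.g. this sketch, untested and assuming dataSource.pgbackrest accepts options the way dataSource.postgresCluster does:)

```yaml
# Sketch: pinning a specific backup set in the cloud dataSource.
# The options field here is an assumption, mirrored from
# dataSource.postgresCluster; backup label is the one named above.
dataSource:
  pgbackrest:
    stanza: db
    options:
      - --set=20221024-105247F
      - --type=immediate
```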
to be continued...
Scenario-2: restore from the last backup, using the same cluster name:
Here is the config:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: ferahevler
  namespace: psql-ferahevler
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: ferahevler-repl-tls
  customTLSSecret:
    name: ferahevler-tls
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
        - secret:
            name: psql-ferahevler-s3-creds
        - secret:
            name: ferahevler-pgbackrest-secrets
      global:
        repo1-path: /repo1/ferahevler
        repo1-cipher-type: aes-256-cbc
      repo:
        name: repo1
        s3:
          bucket: kmybucketnamehere1
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
  instances:
    - name: pgha1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: ebs-sc-enc
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
        - secret:
            name: psql-ferahevler-s3-creds
        - secret:
            name: ferahevler-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
          - --type=full
      global:
        repo1-path: /repo1/ferahevler
        repo1-retention-full: "14"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
        - name: repo1
          schedules:
            full: "0 */5 * * *"
            # incremental: "0 */6 * * *"
          s3:
            bucket: ku-eksdev1-crunchydata-backups
            endpoint: s3.eu-central-1.amazonaws.com:443
            region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/role: pgbouncer
...and logs:
$ k logs -n psql-ferahevler ferahevler-pgbackrest-restore-xmhrb
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
Error from server (BadRequest): container "pgbackrest-restore" in pod "ferahevler-pgbackrest-restore-xmhrb" is waiting to start: PodInitializing
mkarakas@vm-mkarakas-01:~$ k logs -n psql-ferahevler ferahevler-pgbackrest-restore-xmhrb -f
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
+ pgbackrest restore --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --link-map=pg_wal=/pgdata/pg14_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg14' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2022-10-24 17:51:35.964 GMT [19] LOG: starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
2022-10-24 17:51:35.966 GMT [19] LOG: listening on IPv6 address "::1", port 5432
2022-10-24 17:51:35.966 GMT [19] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-10-24 17:51:35.976 GMT [19] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-24 17:51:35.983 GMT [20] LOG: database system was interrupted; last known up at 2022-10-24 17:34:34 GMT
2022-10-24 17:51:36.171 GMT [20] LOG: starting archive recovery
2022-10-24 17:51:36.285 GMT [20] LOG: restored log file "00000002.history" from archive
2022-10-24 17:51:36.613 GMT [20] LOG: restored log file "00000002000000000000006A" from archive
2022-10-24 17:51:36.672 GMT [20] LOG: redo starts at 0/6A000060
2022-10-24 17:51:36.883 GMT [20] LOG: restored log file "00000002000000000000006B" from archive
2022-10-24 17:51:37.098 GMT [20] LOG: restored log file "00000002000000000000006C" from archive
2022-10-24 17:51:37.165 GMT [20] LOG: consistent recovery state reached at 0/6C000088
2022-10-24 17:51:37.166 GMT [19] LOG: database system is ready to accept read-only connections
2022-10-24 17:51:37.429 GMT [20] LOG: restored log file "00000002000000000000006D" from archive
2022-10-24 17:51:37.649 GMT [20] LOG: restored log file "00000002000000000000006E" from archive
2022-10-24 17:51:37.976 GMT [20] LOG: restored log file "00000002000000000000006F" from archive
2022-10-24 17:51:38.086 GMT [20] LOG: redo done at 0/6F000110 system usage: CPU: user: 0.00 s, system: 0.01 s, elapsed: 1.41 s
2022-10-24 17:51:38.086 GMT [20] LOG: last completed transaction was at log time 2022-10-24 17:35:43.740853+00
2022-10-24 17:51:38.231 GMT [20] LOG: restored log file "00000002000000000000006F" from archive
2022-10-24 17:51:38.326 GMT [20] LOG: selected new timeline ID: 3
2022-10-24 17:51:38.409 GMT [20] LOG: archive recovery complete
2022-10-24 17:51:38.673 GMT [20] LOG: restored log file "00000002.history" from archive
2022-10-24 17:51:38.731 GMT [19] LOG: database system is ready to accept connections
2022-10-24 17:51:38.735 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:38.735 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:39.739 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:39.739 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:40.451 GMT [19] LOG: received fast shutdown request
2022-10-24 17:51:40.455 GMT [19] LOG: aborting any active transactions
2022-10-24 17:51:40.457 GMT [19] LOG: background worker "logical replication launcher" (PID 47) exited with exit code 1
2022-10-24 17:51:40.458 GMT [24] LOG: shutting down
2022-10-24 17:51:40.533 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:40.533 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:40.533 GMT [46] WARNING: archiving write-ahead log file "00000003.history" failed too many times, will try again later
2022-10-24 17:51:40.538 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:40.538 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:41.544 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:41.544 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:42.549 GMT [46] LOG: archive command failed with exit code 1
2022-10-24 17:51:42.549 GMT [46] DETAIL: The failed archive command was: false
2022-10-24 17:51:42.549 GMT [46] WARNING: archiving write-ahead log file "00000003.history" failed too many times, will try again later
2022-10-24 17:51:42.552 GMT [19] LOG: database system is shut down
I checked the DB and saw that all the data from the last backup was restored. (?!)
Scenario-3: different cluster name, specific backup:
Config: psql-restore-pitr kustomize
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: gultepe
  namespace: psql-gultepe
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: gultepe-repl-tls
  customTLSSecret:
    name: gultepe-tls
  instances:
    - name: pgha1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: ebs-sc-enc
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: gultepe
                    postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
        - secret:
            name: psql-gultepe-s3-creds
      manual:
        repoName: repo1
        options:
          - --type=full
      restore:
        enabled: true
        repoName: repo1
        options:
          - --set=20221024-105409F
          - --type=immediate
      global:
        repo1-path: /repo1/gultepe
        repo1-retention-full: "14"
        repo1-retention-full-type: time
      repos:
        - name: repo1
          schedules:
            full: "0 */5 * * *"
            # incremental: "0 */6 * * *"
          s3:
            bucket: mybucketnamehere
            endpoint: s3.eu-central-1.amazonaws.com:443
            region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: gultepe
                    postgres-operator.crunchydata.com/role: pgbouncer
There was no restore at all.
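(Two things seem worth checking in this scenario, assuming standard PGO v5 behavior: an in-place restore defined under backups.pgbackrest.restore only runs once the cluster carries the postgres-operator.crunchydata.com/pgbackrest-restore trigger annotation, and this config also omits the encryption secret and repo1-cipher-type, which the encrypted repo would need. The trigger can be applied with kubectl annotate or in the manifest; a sketch, with an arbitrary trigger value:)

```yaml
# Sketch: restore trigger annotation on the PostgresCluster.
# Equivalent to:
#   kubectl annotate -n psql-gultepe postgrescluster gultepe \
#     postgres-operator.crunchydata.com/pgbackrest-restore=id1
metadata:
  annotations:
    postgres-operator.crunchydata.com/pgbackrest-restore: id1
```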
Scenario-4: psql-restore-master-enc-v2
Trying to restore from a specific backup (same cluster name):
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: ferahevler
  namespace: psql-ferahevler
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: ferahevler-repl-tls
  customTLSSecret:
    name: ferahevler-tls
  dataSource:
    postgresCluster:
      clusterName: ferahevler
      repoName: repo1
      options:
        - --set=20221024-105409F
        - --type=immediate
  instances:
    - name: pgha1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: ebs-sc-enc
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 1Gi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
        - secret:
            name: psql-ferahevler-s3-creds
        - secret:
            name: ferahevler-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
          - --type=full
      global:
        repo1-path: /repo1/ferahevler
        repo1-retention-full: "14"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
        - name: repo1
          schedules:
            full: "0 */5 * * *"
            # incremental: "0 */6 * * *"
          s3:
            bucket: ku-eksdev1-crunchydata-backups
            endpoint: s3.eu-central-1.amazonaws.com:443
            region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/cluster: ferahevler
                    postgres-operator.crunchydata.com/role: pgbouncer
logs:
$ k logs -n psql-ferahevler ferahevler-pgbackrest-restore-r56m2 -f
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
+ pgbackrest restore --set=20221024-105409F --type=immediate --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --link-map=pg_wal=/pgdata/pg14_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg14' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2022-10-24 18:24:29.989 GMT [18] LOG: starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
2022-10-24 18:24:29.990 GMT [18] LOG: listening on IPv6 address "::1", port 5432
2022-10-24 18:24:29.990 GMT [18] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-10-24 18:24:29.997 GMT [18] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-24 18:24:30.004 GMT [19] LOG: database system was interrupted; last known up at 2022-10-24 10:54:13 GMT
2022-10-24 18:24:30.018 GMT [19] LOG: starting point-in-time recovery to earliest consistent point
2022-10-24 18:24:30.296 GMT [19] LOG: restored log file "00000001000000000000000E" from archive
2022-10-24 18:24:30.364 GMT [19] LOG: redo starts at 0/E000028
2022-10-24 18:24:30.368 GMT [19] LOG: consistent recovery state reached at 0/E000138
2022-10-24 18:24:30.368 GMT [18] LOG: database system is ready to accept read-only connections
2022-10-24 18:24:30.368 GMT [19] LOG: recovery stopping after reaching consistency
2022-10-24 18:24:30.368 GMT [19] LOG: pausing at the end of recovery
2022-10-24 18:24:30.368 GMT [19] HINT: Execute pg_wal_replay_resume() to promote.
2022-10-24 18:24:30.488 GMT [19] LOG: redo done at 0/E000138 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.12 s
2022-10-24 18:24:30.646 GMT [19] LOG: restored log file "00000002.history" from archive
2022-10-24 18:24:30.819 GMT [19] LOG: restored log file "00000003.history" from archive
2022-10-24 18:24:30.995 GMT [19] LOG: restored log file "00000004.history" from archive
2022-10-24 18:24:31.138 GMT [19] LOG: selected new timeline ID: 5
2022-10-24 18:24:31.217 GMT [19] LOG: archive recovery complete
2022-10-24 18:24:31.437 GMT [18] LOG: database system is ready to accept connections
2022-10-24 18:24:31.451 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:31.451 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:32.457 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:32.457 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:32.541 GMT [18] LOG: received fast shutdown request
2022-10-24 18:24:32.544 GMT [18] LOG: aborting any active transactions
2022-10-24 18:24:32.549 GMT [18] LOG: background worker "logical replication launcher" (PID 36) exited with exit code 1
2022-10-24 18:24:32.551 GMT [21] LOG: shutting down
2022-10-24 18:24:32.656 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:32.656 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:32.656 GMT [35] WARNING: archiving write-ahead log file "00000005.history" failed too many times, will try again later
2022-10-24 18:24:32.660 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:32.660 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:33.667 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:33.667 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:34.672 GMT [35] LOG: archive command failed with exit code 1
2022-10-24 18:24:34.672 GMT [35] DETAIL: The failed archive command was: false
2022-10-24 18:24:34.672 GMT [35] WARNING: archiving write-ahead log file "00000005.history" failed too many times, will try again later
2022-10-24 18:24:34.676 GMT [18] LOG: database system is shut down
I checked the database and the restore was successful. (?!)
Scenario-5: This is the same scenario as Scenario-4, but with a different cluster name.
config:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: seyrantepe
  namespace: psql-seyrantepe
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: seyrantepe-repl-tls
  customTLSSecret:
    name: seyrantepe-tls
  dataSource:
    postgresCluster:
      clusterName: ferahevler
      repoName: repo1
      options:
      - --set=20221024-105409F
      - --type=immediate
  instances:
  - name: pgha1
    replicas: 2
    dataVolumeClaimSpec:
      storageClassName: ebs-sc-enc
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: seyrantepe
                postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
      - secret:
          name: psql-seyrantepe-s3-creds
      - secret:
          name: seyrantepe-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
        - --type=full
      global:
        repo1-path: /repo1/seyrantepe
        repo1-retention-full: "14"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
      - name: repo1
        schedules:
          full: "0 */5 * * *"
          # incremental: "0 */6 * * *"
        s3:
          bucket: mybucketnamehere
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/cluster: seyrantepe
                  postgres-operator.crunchydata.com/role: pgbouncer
logs:
$ k get events -n psql-seyrantepe
LAST SEEN TYPE REASON OBJECT MESSAGE
39s Normal cert-manager.io certificaterequest/seyrantepe-primary-ht8br Certificate request has been approved by cert-manager.io
39s Normal CertificateIssued certificaterequest/seyrantepe-primary-ht8br Certificate fetched from issuer successfully
39s Normal Issuing certificate/seyrantepe-primary Issuing certificate as Secret does not exist
39s Normal Generated certificate/seyrantepe-primary Stored new private key in temporary Secret resource "seyrantepe-primary-rp6dz"
39s Normal Requested certificate/seyrantepe-primary Created new CertificateRequest resource "seyrantepe-primary-ht8br"
39s Normal Issuing certificate/seyrantepe-primary The certificate has been successfully issued
39s Normal cert-manager.io certificaterequest/seyrantepe-replication-6fhjk Certificate request has been approved by cert-manager.io
39s Normal CertificateIssued certificaterequest/seyrantepe-replication-6fhjk Certificate fetched from issuer successfully
39s Normal Issuing certificate/seyrantepe-replication Issuing certificate as Secret does not exist
39s Normal Generated certificate/seyrantepe-replication Stored new private key in temporary Secret resource "seyrantepe-replication-gljqn"
39s Normal Requested certificate/seyrantepe-replication Created new CertificateRequest resource "seyrantepe-replication-6fhjk"
39s Normal Issuing certificate/seyrantepe-replication The certificate has been successfully issued
38s Warning InvalidDataSource postgrescluster/seyrantepe PostgresCluster "ferahevler" does not exist
I am totally lost :(
Scenario-6: This is the same scenario as Scenario-1, but this time I try to restore using the same cluster name:
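A note on the InvalidDataSource event above: `spec.dataSource.postgresCluster` clones from an existing PostgresCluster, so the operator rejects it once "ferahevler" has been deleted. To clone purely from backups already sitting in S3, the data source takes the `pgbackrest` form instead. A hedged sketch (secret names, bucket, and repo path are copied from the configs in this thread and may need adjusting for your environment):

```yaml
# Sketch only: clone "seyrantepe" directly from the S3 repo that the
# deleted "ferahevler" cluster backed up to. No source cluster required.
spec:
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
      - secret:
          name: psql-seyrantepe-s3-creds       # assumed credential secret
      - secret:
          name: seyrantepe-pgbackrest-secrets  # assumed cipher-pass secret
      global:
        repo1-path: /repo1/ferahevler          # must match the OLD cluster's path
        repo1-cipher-type: aes-256-cbc         # must match the OLD repo's cipher
      repo:
        name: repo1
        s3:
          bucket: mybucketnamehere
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
      options:
      - --set=20221024-105409F
      - --type=immediate
```

The key point is that everything under `dataSource.pgbackrest.global` describes the repository the backups were written to, not the repository the new cluster will write to.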
config:
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: ferahevler
  namespace: psql-ferahevler
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.5-1
  postgresVersion: 14
  customReplicationTLSSecret:
    name: ferahevler-repl-tls
  customTLSSecret:
    name: ferahevler-tls
  dataSource:
    pgbackrest:
      stanza: db
      configuration:
      - secret:
          name: psql-ferahevler-s3-creds
      - secret:
          name: ferahevler-pgbackrest-secrets
      global:
        repo1-path: /repo1/ferahevler
        repo1-cipher-type: aes-256-cbc
      repo:
        name: repo1
        s3:
          bucket: ku-eksdev1-crunchydata-backups
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
  users:
  - name: postgres
  - name: ferahevler
    databases:
    - ferahevler
  instances:
  - name: pgha1
    replicas: 2
    dataVolumeClaimSpec:
      storageClassName: ebs-sc-enc
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: ferahevler
                postgres-operator.crunchydata.com/instance-set: pgha1
  monitoring:
    pgmonitor:
      exporter:
        image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.2.0-0
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.40-1
      configuration:
      - secret:
          name: psql-ferahevler-s3-creds
      - secret:
          name: ferahevler-pgbackrest-secrets
      manual:
        repoName: repo1
        options:
        - --type=full
      global:
        repo1-path: /repo1/ferahevler
        repo1-retention-full: "30"
        repo1-retention-full-type: time
        repo1-cipher-type: aes-256-cbc
      repos:
      - name: repo1
        schedules:
          full: "0 */5 * * *"
          # incremental: "0 */6 * * *"
        s3:
          bucket: mybucketnamehere
          endpoint: s3.eu-central-1.amazonaws.com:443
          region: eu-central-1
  proxy:
    pgBouncer:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.17-1
      replicas: 2
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/cluster: ferahevler
                  postgres-operator.crunchydata.com/role: pgbouncer
logs:
$ k logs -n psql-ferahevler ferahevler-pgbackrest-restore-bgt4r -f
Defaulted container "pgbackrest-restore" out of: pgbackrest-restore, nss-wrapper-init (init)
+ pgbackrest restore --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --link-map=pg_wal=/pgdata/pg14_wal
WARN: unable to open log file '/pgdata/pgbackrest/log/db-restore.log': No such file or directory
NOTE: process will continue without log file.
WARN: --delta or --force specified but unable to find 'PG_VERSION' or 'backup.manifest' in '/pgdata/pg14' to confirm that this is a valid $PGDATA directory. --delta and --force have been disabled and if any files exist in the destination directories the restore will be aborted.
2022-10-24 19:12:31.474 GMT [19] LOG: starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), 64-bit
2022-10-24 19:12:31.480 GMT [19] LOG: listening on IPv6 address "::1", port 5432
2022-10-24 19:12:31.480 GMT [19] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-10-24 19:12:31.486 GMT [19] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-24 19:12:31.517 GMT [20] LOG: database system was interrupted; last known up at 2022-10-24 19:07:45 GMT
2022-10-24 19:12:31.705 GMT [20] LOG: starting archive recovery
2022-10-24 19:12:31.797 GMT [20] LOG: restored log file "00000006.history" from archive
2022-10-24 19:12:32.010 GMT [20] LOG: restored log file "000000060000000000000017" from archive
2022-10-24 19:12:32.059 GMT [20] LOG: redo starts at 0/17000028
2022-10-24 19:12:32.209 GMT [20] LOG: restored log file "000000060000000000000018" from archive
2022-10-24 19:12:32.409 GMT [20] LOG: restored log file "000000060000000000000019" from archive
2022-10-24 19:12:32.461 GMT [20] LOG: consistent recovery state reached at 0/19000050
2022-10-24 19:12:32.462 GMT [19] LOG: database system is ready to accept read-only connections
2022-10-24 19:12:32.544 GMT [20] LOG: redo done at 0/19000050 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.48 s
2022-10-24 19:12:32.544 GMT [20] LOG: last completed transaction was at log time 2022-10-24 19:07:30.807409+00
2022-10-24 19:12:32.741 GMT [20] LOG: restored log file "000000060000000000000019" from archive
2022-10-24 19:12:32.836 GMT [20] LOG: selected new timeline ID: 7
2022-10-24 19:12:32.890 GMT [20] LOG: archive recovery complete
2022-10-24 19:12:33.001 GMT [20] LOG: restored log file "00000006.history" from archive
2022-10-24 19:12:33.062 GMT [19] LOG: database system is ready to accept connections
2022-10-24 19:12:33.069 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:33.069 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:34.074 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:34.074 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:34.603 GMT [19] LOG: received fast shutdown request
2022-10-24 19:12:34.605 GMT [19] LOG: aborting any active transactions
2022-10-24 19:12:34.608 GMT [19] LOG: background worker "logical replication launcher" (PID 40) exited with exit code 1
2022-10-24 19:12:34.609 GMT [24] LOG: shutting down
2022-10-24 19:12:34.663 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:34.663 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:34.663 GMT [39] WARNING: archiving write-ahead log file "00000007.history" failed too many times, will try again later
2022-10-24 19:12:34.667 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:34.667 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:35.672 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:35.672 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:36.691 GMT [39] LOG: archive command failed with exit code 1
2022-10-24 19:12:36.691 GMT [39] DETAIL: The failed archive command was: false
2022-10-24 19:12:36.692 GMT [39] WARNING: archiving write-ahead log file "00000007.history" failed too many times, will try again later
2022-10-24 19:12:36.710 GMT [19] LOG: database system is shut down
I checked the database and the restore was successful; the data is there.
So while the backup cannot be restored into a differently named cluster, it can be restored if I reuse the original name.
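As an aside: when the cluster still exists and only its data needs rewinding, PGO v5 also supports an in-place restore, so recreating the cluster with a `dataSource` is not the only option. A hedged sketch, with field names as described in the PGO v5 documentation (the annotation value is an arbitrary trigger ID):

```yaml
# Sketch only: in-place restore of an existing cluster. The restore job
# runs when the trigger annotation below is added or its value changes.
metadata:
  annotations:
    postgres-operator.crunchydata.com/pgbackrest-restore: id1  # arbitrary ID
spec:
  backups:
    pgbackrest:
      restore:
        enabled: true
        repoName: repo1
        options:
        - --type=immediate
```

After the restore completes, `enabled` is typically set back to `false` so a later edit does not trigger another restore unintentionally.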
Hello @tirelibirefe !
It looks like in some scenarios, you're attempting to clone from backup data and in others, you're trying to clone from an existing cluster. Different things can go wrong, depending on the scenario.
I'd like to focus on Scenario 1 to understand what's going wrong. I've noticed that the section for specifying your data source (spec.dataSource.pgbackrest) is configured in the same manner as the section for backing up data in the new cluster (spec.backups.pgbackrest.repos).
Under your data source:
global:
  repo1-path: /repo1/ferahevler
  repo1-cipher-type: aes-256-cbc
repo:
  name: repo1
  s3:
    bucket: ku-eksdev1-crunchydata-backups
    endpoint: s3.eu-central-1.amazonaws.com:443
    region: eu-central-1
Under the new cluster's backups section:
global:
  repo1-path: /repo1/ferahevler
  repo1-cipher-type: aes-256-cbc
...
repos:
...
  s3:
    bucket: ku-eksdev1-crunchydata-backups
    endpoint: s3.eu-central-1.amazonaws.com:443
    region: eu-central-1
It would be better to back up the new cluster, seyrantepe, to a different path (like /repo1/seyrantepe) to make sure the data from ferahevler isn't accidentally overwritten.
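Concretely, the combination to aim for is a data source that reads from the old cluster's path and a backups section that writes to a new one. A hedged sketch for the new cluster (only the relevant fields shown, with the names used earlier in this thread):

```yaml
# Sketch only: read backups from the OLD path, write backups to a NEW path.
spec:
  dataSource:
    pgbackrest:
      global:
        repo1-path: /repo1/ferahevler   # path the old cluster backed up to
        repo1-cipher-type: aes-256-cbc
      ...
  backups:
    pgbackrest:
      global:
        repo1-path: /repo1/seyrantepe   # distinct path for the new cluster
        repo1-cipher-type: aes-256-cbc
      ...
```

With the two paths identical, the new cluster's first backups land on top of the source repository you are trying to restore from.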
As you've probably noticed, the WARNING and failed archive messages do not indicate a failure to restore. Your logs indicate that archives are being restored and that the job is succeeding. When you've experienced a failure, what commands have you used to see that your data hasn't been restored? And what is the output from those commands?
Hello, PGO 5.1, PostgreSQL 14, EKS 1.2
I followed the recipe here, under the topic "Clone From Backups Stored in S3 / GCS / Azure Blob Storage".
I had a PostgreSQL cluster named "ferahevler" with its backups in S3.
I deleted "ferahevler" and would like to restore/clone it from S3 into a new PostgreSQL cluster named "seyrantepe".
here is the config:
When I run the new cluster installation, I get the following error message in the restore pod:
The restore cannot be completed successfully. Is there a workaround you can advise?
Thanks & Regards