Closed: necipakca closed this issue 2 years ago
It was a copy-paste issue. I had forgotten to remove these parameters from my Patroni config, so no WAL archives were being backed up to S3. After deleting these parameters, everything works fine:

archive_command: /bin/true
archive_mode: "on"
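For anyone hitting the same symptom, a minimal sketch of the problematic fragment as it would sit in a Patroni configuration (the surrounding keys are illustrative, only the two parameters come from the comment above):

```yaml
# Sketch of the offending Patroni/Spilo configuration (surrounding keys are
# illustrative). With archive_command set to /bin/true, PostgreSQL believes
# every WAL segment was archived successfully, so nothing is ever uploaded to
# S3 and a later point-in-time recovery cannot find the WAL it needs.
postgresql:
  parameters:
    archive_mode: "on"
    archive_command: /bin/true   # exits 0 without copying the segment anywhere
```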
Please answer some short questions, which should help us understand your problem / question better.
Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.2
Where do you run it - cloud or metal? Bare Metal K8s
Are you running Postgres Operator in production? no
Type of issue? Bug report
Spilo version: 14:2.1-p6
I can successfully take backups but cannot restore them.
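Backups succeeding while restores fail is consistent with the root cause in the closing comment: in Spilo, base backups are taken by a scheduled job, while individual WAL segments go through `archive_command`. PostgreSQL treats a segment as archived as soon as that command exits 0, and `/bin/true` does exactly that without copying anything. A minimal demonstration (the temp paths are stand-ins for a WAL segment and an archive destination):

```shell
#!/bin/sh
# PostgreSQL marks a WAL segment as archived as soon as archive_command exits 0.
# With archive_command = /bin/true the command "succeeds" but copies nothing, so
# base backups still work while every WAL segment is silently discarded.

# Hypothetical stand-ins for a WAL segment and an archive destination:
wal_segment=$(mktemp)
archive_dir=$(mktemp -d)

# /bin/true ignores its arguments and exits 0 without touching either path.
/bin/true "$wal_segment" "$archive_dir/"
echo "archive_command exit code: $?"

# The destination stays empty: nothing was actually archived.
if [ -z "$(ls -A "$archive_dir")" ]; then
    echo "nothing was archived"
fi
```

A later point-in-time recovery then finds the base backup but none of the WAL needed to reach a consistent state, which is exactly the "could not locate required checkpoint record" failure in the logs below.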
2022-08-12 14:03:02,795 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2022-08-12 14:03:04,800 - bootstrapping - INFO - Could not connect to 169.254.169.254, assuming local Docker setup
2022-08-12 14:03:04,801 - bootstrapping - INFO - No meta-data available for this provider
2022-08-12 14:03:04,801 - bootstrapping - INFO - Looks like your running local
2022-08-12 14:03:04,825 - bootstrapping - INFO - Configuring log
2022-08-12 14:03:04,825 - bootstrapping - INFO - Configuring standby-cluster
2022-08-12 14:03:04,825 - bootstrapping - INFO - Configuring certificate
2022-08-12 14:03:04,825 - bootstrapping - INFO - Generating ssl self-signed certificate
2022-08-12 14:03:04,873 - bootstrapping - INFO - Configuring crontab
2022-08-12 14:03:04,874 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2022-08-12 14:03:04,879 - bootstrapping - INFO - Configuring pam-oauth2
2022-08-12 14:03:04,880 - bootstrapping - INFO - Writing to file /etc/pam.d/postgresql
2022-08-12 14:03:04,880 - bootstrapping - INFO - Configuring pgqd
2022-08-12 14:03:04,880 - bootstrapping - INFO - Configuring patroni
2022-08-12 14:03:04,887 - bootstrapping - INFO - Writing to file /run/postgres.yml
2022-08-12 14:03:04,887 - bootstrapping - INFO - Configuring bootstrap
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALE_S3_PREFIX
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALG_S3_PREFIX
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/AWS_ACCESS_KEY_ID
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/AWS_SECRET_ACCESS_KEY
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALE_S3_ENDPOINT
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/AWS_ENDPOINT
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALE_DISABLE_S3_SSE
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALG_DISABLE_S3_SSE
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/AWS_S3_FORCE_PATH_STYLE
2022-08-12 14:03:04,888 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/WALE_LOG_DESTINATION
2022-08-12 14:03:04,890 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env-clone-vmyd-system-dbs/TMPDIR
2022-08-12 14:03:04,890 - bootstrapping - INFO - Configuring pgbouncer
2022-08-12 14:03:04,890 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2022-08-12 14:03:04,890 - bootstrapping - INFO - Configuring wal-e
2022-08-12 14:03:04,890 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALE_S3_PREFIX
2022-08-12 14:03:04,890 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALG_S3_PREFIX
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/AWS_ACCESS_KEY_ID
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALE_S3_ENDPOINT
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/AWS_ENDPOINT
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALE_DISABLE_S3_SSE
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALG_DISABLE_S3_SSE
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/AWS_S3_FORCE_PATH_STYLE
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALG_DOWNLOAD_CONCURRENCY
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALG_UPLOAD_CONCURRENCY
2022-08-12 14:03:04,891 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/WALE_LOG_DESTINATION
2022-08-12 14:03:04,892 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/PGPORT
2022-08-12 14:03:04,892 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/BACKUP_NUM_TO_RETAIN
2022-08-12 14:03:04,892 - bootstrapping - INFO - Writing to file /run/etc/wal-e.d/env/TMPDIR
2022-08-12 14:03:06,082 WARNING: Kubernetes RBAC doesn't allow GET access to the 'kubernetes' endpoint in the 'default' namespace. Disabling 'bypass_api_service'.
2022-08-12 14:03:06,096 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-08-12 14:03:06,097 INFO: Lock owner: None; I am vmyd-test-0
2022-08-12 14:03:06,132 INFO: trying to bootstrap a new cluster
2022-08-12 14:03:06,132 INFO: Running custom bootstrap script: envdir "/run/etc/wal-e.d/env-clone-vmyd-system-dbs" python3 /scripts/clone_with_wale.py --recovery-target-time="2023-08-12T08:01:41+00:00"
2022-08-12 14:03:06,245 INFO: Trying s3://backups/spilo/vmyd-system-dbs/wal/13/ for clone
wal_e.main INFO MSG: starting WAL-E
        DETAIL: The subcommand is "backup-list".
        STRUCTURED: time=2022-08-12T14:03:06.351120-00 pid=78
2022-08-12 14:03:06,405 INFO: cloning cluster vmyd-test using wal-e backup-fetch /home/postgres/pgdata/pgroot/data base_000000010000000000000006_00000040
wal_e.main INFO MSG: starting WAL-E
        DETAIL: The subcommand is "backup-fetch".
        STRUCTURED: time=2022-08-12T14:03:06.508045-00 pid=82
wal_e.worker.s3.s3_worker INFO MSG: beginning partition download
        DETAIL: The partition being downloaded is part_00000000.tar.lzo.
        HINT: The absolute S3 key is spilo/vmyd-system-dbs/wal/13/basebackups_005/base_000000010000000000000006_00000040/tar_partitions/part_00000000.tar.lzo.
        STRUCTURED: time=2022-08-12T14:03:06.649936-00 pid=82
wal_e.worker.s3.s3_worker INFO MSG: beginning partition download
        DETAIL: The partition being downloaded is part_00000001.tar.lzo.
        HINT: The absolute S3 key is spilo/vmyd-system-dbs/wal/13/basebackups_005/base_000000010000000000000006_00000040/tar_partitions/part_00000001.tar.lzo.
        STRUCTURED: time=2022-08-12T14:03:06.650567-00 pid=82
2022-08-12 14:03:07,795 maybe_pg_upgrade INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-08-12 14:03:07,807 maybe_pg_upgrade INFO: Cluster version: 13, bin version: 13
2022-08-12 14:03:07,807 maybe_pg_upgrade INFO: Trying to perform point-in-time recovery
2022-08-12 14:03:07,807 maybe_pg_upgrade INFO: Running custom bootstrap script: true
2022-08-12 14:03:07 UTC [103]: [1-1] 62f65d9b.67 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2022-08-12 14:03:07 UTC [103]: [2-1] 62f65d9b.67 0 LOG: pg_stat_kcache.linux_hz is set to 1000000
2022-08-12 14:03:07,946 maybe_pg_upgrade INFO: postmaster pid=103
/var/run/postgresql:5432 - no response
2022-08-12 14:03:07 UTC [103]: [3-1] 62f65d9b.67 0 LOG: redirecting log output to logging collector process
2022-08-12 14:03:07 UTC [103]: [4-1] 62f65d9b.67 0 HINT: Future log output will appear in directory "../pg_log".
2022-08-12 14:03:08,955 maybe_pg_upgrade ERROR: postmaster is not running
Traceback (most recent call last):
  File "/scripts/maybe_pg_upgrade.py", line 51, in perform_pitr
    raise Exception('Point-in-time recovery failed')
Exception: Point-in-time recovery failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/scripts/maybe_pg_upgrade.py", line 140, in <module>
    main()
  File "/scripts/maybe_pg_upgrade.py", line 87, in main
    perform_pitr(upgrade, cluster_version, bin_version, config['bootstrap'])
  File "/scripts/maybe_pg_upgrade.py", line 70, in perform_pitr
    raise Exception('Point-in-time recovery failed.\nLOGS:\n--\n' + logs)
Exception: Point-in-time recovery failed.
LOGS:
If you are not restoring from a backup, try removing the file ""/home/postgres/pgdata/pgroot/data/backup_label"". Be careful: removing ""/home/postgres/pgdata/pgroot/data/backup_label"" will result in a corrupt cluster if restoring from a backup.",,,,,,,"","startup"
2022-08-12 14:03:08.006 UTC,,,103,,62f65d9b.67,6,,2022-08-12 14:03:07 UTC,,0,LOG,00000,"startup process (PID 106) exited with exit code 1",,,,,,,,,"","postmaster"
2022-08-12 14:03:08.006 UTC,,,103,,62f65d9b.67,7,,2022-08-12 14:03:07 UTC,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,,"","postmaster"
2022-08-12 14:03:08.007 UTC,,,103,,62f65d9b.67,8,,2022-08-12 14:03:07 UTC,,0,LOG,00000,"database system is shut down",,,,,,,,,"","postmaster"
2022-08-12 14:03:08,975 ERROR: /scripts/maybe_pg_upgrade.py script failed
2022-08-12 14:03:08,984 INFO: removing initialize key after failed attempt to bootstrap the cluster
2022-08-12 14:03:08,999 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data_2022-08-12-14-03-08
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 143, in main
    return patroni_main()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 135, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 100, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 105, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 59, in run
    self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 108, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1514, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1388, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1280, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1273, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
POSTGRES LOGS
2022-08-12 14:26:27.947 UTC,,,277,,62f66313.115,1,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"","postmaster"
2022-08-12 14:26:27.947 UTC,,,277,,62f66313.115,2,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"starting PostgreSQL 13.7 (Ubuntu 13.7-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.947 UTC,,,277,,62f66313.115,3,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"listening on IPv4 address ""0.0.0.0"", port 5432",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.947 UTC,,,277,,62f66313.115,4,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"listening on IPv6 address ""::"", port 5432",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.950 UTC,,,277,,62f66313.115,5,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"listening on Unix socket ""/var/run/postgresql/.s.PGSQL.5432""",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.954 UTC,,,280,,62f66313.118,1,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"database system was interrupted; last known up at 2022-08-12 14:15:04 UTC",,,,,,,,,"","startup"
2022-08-12 14:26:27.954 UTC,,,280,,62f66313.118,2,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"creating missing WAL directory ""pg_wal/archive_status""",,,,,,,,,"","startup"
2022-08-12 14:26:27.971 UTC,,,280,,62f66313.118,3,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"starting point-in-time recovery to 2023-08-12 08:01:41+00",,,,,,,,,"","startup"
2022-08-12 14:26:27.979 UTC,,,280,,62f66313.118,4,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"invalid checkpoint record",,,,,,,,,"","startup"
2022-08-12 14:26:27.979 UTC,,,280,,62f66313.118,5,,2022-08-12 14:26:27 UTC,,0,FATAL,XX000,"could not locate required checkpoint record",,"If you are restoring from a backup, touch ""/home/postgres/pgdata/pgroot/data/recovery.signal"" and add required recovery options. If you are not restoring from a backup, try removing the file ""/home/postgres/pgdata/pgroot/data/backup_label"". Be careful: removing ""/home/postgres/pgdata/pgroot/data/backup_label"" will result in a corrupt cluster if restoring from a backup.",,,,,,,"","startup"
2022-08-12 14:26:27.980 UTC,,,277,,62f66313.115,6,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"startup process (PID 280) exited with exit code 1",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.980 UTC,,,277,,62f66313.115,7,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,,"","postmaster"
2022-08-12 14:26:27.982 UTC,,,277,,62f66313.115,8,,2022-08-12 14:26:27 UTC,,0,LOG,00000,"database system is shut down",,,,,,,,,"","postmaster"