zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

Backup with `ghcr.io/zalando/postgres-operator/logical-backup:v1.12.2` fails #2683

Closed: jonathon2nd closed this issue 3 days ago

jonathon2nd commented 3 days ago


I have looked at the script, https://github.com/zalando/postgres-operator/blob/v1.12.2/logical-backup/dump.sh. It already appears to use `pg_dumpall`, so I am not sure what the issue is.

Logs:

IPv4
2024-07-01T19:39:53.392981764Z API Endpoint: https://10.3.0.1:443/api/v1
2024-07-01T19:39:53.412770214Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2024-07-01T19:39:53.412781054Z                                  Dload  Upload   Total   Spent    Left  Speed
2024-07-01T19:39:53.455647361Z 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 19480    0 19480    0     0   436k      0 --:--:-- --:--:-- --:--:--  442k
2024-07-01T19:39:53.464259553Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2024-07-01T19:39:53.464271345Z                                  Dload  Upload   Total   Spent    Left  Speed
2024-07-01T19:39:53.514821738Z 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   118  100   118    0     0   2295      0 --:--:-- --:--:-- --:--:--  2313
2024-07-01T19:39:53.523263158Z   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
2024-07-01T19:39:53.523273407Z                                  Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 57907    0 57907    0     0  1077k      0 --:--:-- --:--:-- --:--:-- 1087k
2024-07-01T19:39:53.579086316Z + '[' s3 == az ']'
2024-07-01T19:39:53.579100685Z + dump
2024-07-01T19:39:53.579103299Z + /usr/lib/postgresql/14/bin/pg_dumpall
2024-07-01T19:39:53.579105203Z + compress
2024-07-01T19:39:53.579106866Z + pigz
2024-07-01T19:39:53.579108479Z + upload
2024-07-01T19:39:53.579110103Z + case $LOGICAL_BACKUP_PROVIDER in
2024-07-01T19:39:53.579111786Z ++ estimate_size
++ /usr/lib/postgresql/14/bin/psql -tqAc 'select sum(pg_database_size(datname)::numeric) from pg_database;'
2024-07-01T19:39:53.613034016Z + aws_upload 17400386940
2024-07-01T19:39:53.613047422Z + declare -r EXPECTED_SIZE=17400386940
2024-07-01T19:39:53.613049736Z /dump.sh: line 110: LOGICAL_BACKUP_S3_BUCKET_PREFIX: unbound variable
2024-07-01T19:39:57.942662442Z pg_dump: warning: there are circular foreign-key constraints on this table:
2024-07-01T19:39:57.942723221Z pg_dump:   hypertable
2024-07-01T19:39:57.942728511Z pg_dump: You might not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
2024-07-01T19:39:57.942731667Z pg_dump: Consider using a full dump instead of a --data-only dump to avoid this problem.
2024-07-01T19:39:57.942733831Z pg_dump: warning: there are circular foreign-key constraints on this table:
2024-07-01T19:39:57.942735776Z pg_dump:   chunk
2024-07-01T19:39:57.942737919Z pg_dump: You might not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
2024-07-01T19:39:57.942740745Z pg_dump: Consider using a full dump instead of a --data-only dump to avoid this problem.
2024-07-01T19:39:57.942743200Z pg_dump: warning: there are circular foreign-key constraints on this table:
pg_dump:   continuous_agg
2024-07-01T19:39:57.942747538Z pg_dump: You might not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
2024-07-01T19:39:57.942750103Z pg_dump: Consider using a full dump instead of a --data-only dump to avoid this problem.
2024-07-01T19:39:59.140926437Z pg_dumpall: error: pg_dump failed on database "production", exiting

jonathon2nd commented 3 days ago

This is not DB specific. I am seeing this error across other DBs.

jonathon2nd commented 3 days ago

I am seeing the same issue with registry.opensource.zalan.do/acid/logical-backup:v1.12.0. Reverting to registry.opensource.zalan.do/acid/logical-backup:v1.11.0 works.

jonathon2nd commented 3 days ago

Ah, I see what happened. I found the old script at https://github.com/zalando/postgres-operator/blob/v1.11.0/docker/logical-backup/dump.sh and saw that `LOGICAL_BACKUP_S3_BUCKET_PREFIX` is handled differently there.

I then went to the cronjobs and compared them. I had been managing the cronjobs manually, since this was not available yet, so they never picked up the new variable. I will add the new value and test.
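
The `unbound variable` line in the log above is the message bash prints when a script running with `set -o nounset` (`set -u`) expands a variable that was never set. Below is a minimal sketch of that failure mode, assuming dump.sh runs with nounset enabled, which the error message suggests. The `build_path` function, the bucket name, and the `spilo` value are hypothetical stand-ins for illustration, not the operator's actual code or defaults.

```shell
#!/usr/bin/env bash
set -o nounset

# Hypothetical stand-in for the step that builds the upload path.
build_path() {
    echo "s3://example-bucket/${LOGICAL_BACKUP_S3_BUCKET_PREFIX}/dump.sql.gz"
}

# Run in a subshell with the variable unset: under nounset the
# expansion aborts the subshell with a non-zero exit status.
if ! ( unset LOGICAL_BACKUP_S3_BUCKET_PREFIX; build_path ) 2>/dev/null; then
    echo "failed: LOGICAL_BACKUP_S3_BUCKET_PREFIX is unbound"
fi

# With the variable set (example value only), the same function succeeds.
LOGICAL_BACKUP_S3_BUCKET_PREFIX=spilo build_path
```

A manually managed cronjob that predates the variable would hit the first branch. Adding the variable to the cronjob's container environment corresponds to the second call.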

jonathon2nd commented 3 days ago

Yes, that was it. My bad. Leaving this here in case at least one other person has done the same thing.
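
For anyone else maintaining cronjobs by hand, a small pre-flight check can surface a missing variable before a backup image upgrade does. This is only a sketch: `missing_vars` is a hypothetical helper, and the two names passed to it are simply the variables that appear in the log above, not a complete list of what dump.sh requires.

```shell
#!/usr/bin/env bash
set -o nounset

# Print the name of every listed variable that is unset.
# Returns non-zero if at least one is missing.
missing_vars() {
    local name status=0
    for name in "$@"; do
        # ${!name+x} expands to "x" only when the variable named by $name is set.
        if [ -z "${!name+x}" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}

# Example: check the variables seen in the log against the current environment.
missing_vars LOGICAL_BACKUP_PROVIDER LOGICAL_BACKUP_S3_BUCKET_PREFIX \
    || echo "set the variables listed above in the cronjob before running the backup"
```

Running this inside the backup container, with the environment the cronjob provides, lists whatever the job spec still lacks.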