Closed alrooney closed 3 years ago
In an emergency situation, you can manually execute pgbackrest expire on the Pod. I would suggest expiring the oldest backup.
Please note all of the warnings associated with running this command.
Thanks - so I should exec into the pgBackRest repo Pod?
Looking through the pgBackRest docs, I'm confused about how to run that expire command. I did try running a backup as follows, but it did not work:
pgo backup retroelk-prod1-azure --backup-opts="--type=full --repo1-retention-full=7 expire" -n pgo
Yes, you can execute it on the pgBackRest Pod.
Here is an example. In this case, I'm going to delete my latest backup.

🚨 pgbackrest expire removes backups, so please do so at your own discretion 🚨

1. Find the backup you want to expire:

pgo show backup lion
cluster: lion
storage type: local
stanza: db
status: ok
cipher: none
db (current)
wal archive min/max (13-1)
full backup: 20201209-144519F
timestamp start/stop: 2020-12-09 14:45:19 +0000 UTC / 2020-12-09 14:45:29 +0000 UTC
wal start/stop: 000000010000000000000002 / 000000010000000000000002
database size: 31.0MiB, backup size: 31.0MiB
repository size: 3.8MiB, repository backup size: 3.8MiB
backup reference list:
incr backup: 20201209-144519F_20201210-144040I
timestamp start/stop: 2020-12-10 14:40:40 +0000 UTC / 2020-12-10 14:40:50 +0000 UTC
wal start/stop: 000000020000000000000006 / 000000020000000000000006
database size: 31.0MiB, backup size: 221.4KiB
repository size: 3.8MiB, repository backup size: 28.0KiB
backup reference list: 20201209-144519F
2. Expire the backup:

kubectl exec -it lion-backrest-shared-repo-84c5fc44d7-mzvms -- pgbackrest expire --set=20201209-144519F_20201210-144040I
WARN: option 'repo1-retention-full' is not set for 'repo1-retention-full-type=count', the repository may run out of space
      HINT: to retain full backups indefinitely (without warning), set option 'repo1-retention-full' to the maximum.
WARN: expiring latest backup 20201209-144519F_20201210-144040I - the ability to perform point-in-time-recovery (PITR) may be affected
      HINT: non-default settings for 'repo1-retention-archive'/'repo1-retention-archive-type' (even in prior expires) can cause gaps in the WAL.
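As the first warning hints, an alternative to expiring a single backup with --set is to pass an explicit full-backup retention count to expire. A minimal sketch, reusing the Pod name from the example above (the retention count of 2 is illustrative - pick the number of full backups you actually want to keep):

```shell
# Keep only the 2 most recent full backups (plus their dependent
# incrementals); everything older is expired in one pass.
kubectl exec -it lion-backrest-shared-repo-84c5fc44d7-mzvms -- \
  pgbackrest expire --repo1-retention-full=2
```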
3. See that it is expired:
pgo show backup lion
cluster: lion
storage type: local
stanza: db
status: ok
cipher: none
db (current)
wal archive min/max (13-1)
full backup: 20201209-144519F
timestamp start/stop: 2020-12-09 14:45:19 +0000 UTC / 2020-12-09 14:45:29 +0000 UTC
wal start/stop: 000000010000000000000002 / 000000010000000000000002
database size: 31.0MiB, backup size: 31.0MiB
repository size: 3.8MiB, repository backup size: 3.8MiB
backup reference list:
Awesome!! Thanks much. Yes - I understand all the warnings :-) We also have full backups in S3, so we're OK with losing them in the backrest repo.
Any suggestions?
$ kk exec -ti retroelk-prod1-azure-backrest-shared-repo-9cd4f6464-4bcm8 -- pgbackrest expire --repo1-retention-full=7
ERROR: [041]: unable to open file '/backrestrepo/retroelk-backrest-shared-repo/backup/db/backup.info' for write: [28] No space left on device
command terminated with exit code 41
Clear ephemeral files (not WAL logs) and/or resize the PVC.
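For the resize path, a sketch of expanding the repo PVC with kubectl. The PVC name and size here are assumptions - look the name up first with kubectl get pvc -n pgo - and online expansion only works if the StorageClass has allowVolumeExpansion: true:

```shell
# Hypothetical PVC name and target size; adjust both for your cluster.
kubectl patch pvc retroelk-prod1-azure-pgbr-repo -n pgo --type merge \
  -p '{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'
```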
Yeah - I get nervous deleting stuff out of the backrest repo because I know it does checksums. So what are the ephemeral files that are safe to delete in the backrest repo?
Again - I'm not worried about losing backups because I have full backups in S3. My main concern is that I don't want to mess up the cluster so that the backrest repo is non-functional. That is actually my current emergency: because the backrest repo is full, my primary DB disk is rapidly filling up with WALs. I'd be fine blowing away the whole repo since I have backups in S3, but I don't want to have to rebuild the cluster or impact the production DB in any way.
re resizing the PVC - I'm also working on that, but it would be simplest if I could just delete some data from the backup repo and set my retention policy. Actually, funny story... the only reason the retention policy is so high on this repo is that I want to keep more backups in S3, and there is no way with the Operator (that I know of) to set a different retention policy or a different schedule (which would let me set a different retention policy) for local vs. S3 backups. I'll have to file a separate feature request on that :-)
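Before deleting anything, it can help to see where the space is actually going. One way (a sketch, reusing the Pod name and repo path from the earlier error message) is:

```shell
# Show per-directory usage inside the repo volume; backup/ holds the
# backup sets, archive/ holds the retained WAL segments.
kubectl exec -it retroelk-prod1-azure-backrest-shared-repo-9cd4f6464-4bcm8 -n pgo -- \
  du -sh /backrestrepo/retroelk-backrest-shared-repo/*
```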
OK - I managed to delete some files and the backrest repo has disk space again, but WAL is still growing on the primary DB. Is there anything we need to restart on the primary to make sure WAL files are being archived and cleared off the primary?
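One way to confirm that archiving has resumed on the primary (a sketch; the Pod name is a placeholder) is to check PostgreSQL's pg_stat_archiver view:

```shell
# Once pgbackrest can write to the repo again, archived_count should be
# increasing and last_failed_wal should stop advancing.
kubectl exec -it <primary-pod> -n pgo -- psql -c \
  "SELECT archived_count, last_archived_wal, failed_count, last_failed_wal FROM pg_stat_archiver;"
```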
Where can I look at the logs for the wal exporter?
Flagging for possible documentation additions around deleting backups as well as potentially adding ability to explicitly delete a backup from the Operator CLI.
An explicit pgo delete backup command that deletes pgBackRest backups has now been added to the Operator and will be in the 4.6 release.
@jkatz please add pgo delete backup to the new 5+ client
pgo 4.5.0
My pgbackrest-repo disk has filled up because I set my retention policy incorrectly. I tried doing a backup with retention set to a lower number in the hope that it would delete old backups before running the backup, but that failed - I assume because it tried to do the backup first and the disk is full. How can I safely remove old backups in the repo to make some space?