EnterpriseDB / barman

Barman - Backup and Recovery Manager for PostgreSQL
https://www.pgbarman.org/
GNU General Public License v3.0

Backup was deleted during recovery process #200

Closed saifulmuhajir closed 2 weeks ago

saifulmuhajir commented 5 years ago

We have a server with a 10-day retention policy: retention_policy: RECOVERY WINDOW OF 10 DAYS

Here is what the list-backup command shows:

$ barman list-backup db-test
db-test 20190223T042124 - Sat Feb 23 13:00:28 2019 - Size: 2.4 TiB - WAL Size: 29.8 GiB
db-test 20190220T052453 - Wed Feb 20 14:01:43 2019 - Size: 2.4 TiB - WAL Size: 41.7 GiB
db-test 20190216T105941 - Sat Feb 16 19:43:34 2019 - Size: 2.4 TiB - WAL Size: 43.9 GiB
db-test 20190211T205100 - Tue Feb 12 04:51:30 2019 - Size: 2.4 TiB - WAL Size: 25.9 GiB

We tried to restore to a remote server as follows:

$ barman recover db-test 20190211T205100 --remote-ssh-command "ssh postgres@destinationserver" /postgres/data96/
Processing xlog segments from streaming for db-test
        000000050000353100000041
Starting remote restore for server db-test using backup 20190211T205100
Destination directory: /postgres/data96/
Copying the base backup.
ERROR: Failure copying base backup: data transfer failure
rsync error:
rsync: change_dir "/barman/db-test/base/20190211T205100/data/" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at flist.c(2118) [sender=3.1.2]

Some hours later, this morning, we found that the recovery had failed: the scheduled backup had started running, and the backup we were restoring (the oldest one) had been deleted.

$ barman list-backup db-test
db-test 20190227T050837 - STARTED
db-test 20190223T042124 - Sat Feb 23 13:00:28 2019 - Size: 2.4 TiB - WAL Size: 29.8 GiB
db-test 20190220T052453 - Wed Feb 20 14:01:43 2019 - Size: 2.4 TiB - WAL Size: 41.7 GiB
db-test 20190216T105941 - Sat Feb 16 19:43:34 2019 - Size: 2.4 TiB - WAL Size: 43.9 GiB
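
The listing is consistent with how a recovery-window policy ages out backups. The following is a minimal sketch of that rule, not Barman's actual code: the function name classify_backups is hypothetical, the reference time is assumed to be the start of the new backup (read from its ID, 20190227T050837), and details such as minimum redundancy are ignored. A backup stays valid if it ended inside the window, and the newest backup that ended before the window start is also kept, because it is needed to recover to the start of the window. Anything older is obsolete.

```python
from datetime import datetime, timedelta


def classify_backups(end_times, now, window):
    """Classify backups as VALID or OBSOLETE under a simplified
    recovery-window rule (illustrative only, not Barman's implementation)."""
    window_start = now - window
    report = {}
    anchor_found = False
    # Walk the backups from newest to oldest.
    for backup_id, end_time in sorted(
        end_times.items(), key=lambda item: item[1], reverse=True
    ):
        if end_time >= window_start:
            # Ended inside the recovery window.
            report[backup_id] = "VALID"
        elif not anchor_found:
            # Newest backup before the window start: still required
            # to recover to the beginning of the window.
            report[backup_id] = "VALID"
            anchor_found = True
        else:
            report[backup_id] = "OBSOLETE"
    return report


# End times taken from the first list-backup output above.
end_times = {
    "20190223T042124": datetime(2019, 2, 23, 13, 0, 28),
    "20190220T052453": datetime(2019, 2, 20, 14, 1, 43),
    "20190216T105941": datetime(2019, 2, 16, 19, 43, 34),
    "20190211T205100": datetime(2019, 2, 12, 4, 51, 30),
}

# Assumption: evaluate the policy when the new backup started.
now = datetime(2019, 2, 27, 5, 8, 37)
report = classify_backups(end_times, now, timedelta(days=10))

for backup_id in sorted(report):
    print(backup_id, report[backup_id])
```

Under these assumptions only 20190211T205100 is classified as OBSOLETE, which matches the backup that went missing while it was being restored.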

It would be good if the barman recover process touched a status file tied to the backup ID. When auto-deletion based on the retention policy, or the barman delete command, is executed, it should check that file, and the delete should fail if the backup is in use.
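
As a rough illustration of this proposal, here is a minimal sketch of a marker-file guard. None of these names exist in Barman: recovery_marker, delete_backup, BackupInUseError and the marker file name .recovery-in-progress are all hypothetical, and the demo runs against a temporary directory rather than a real backup catalogue.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

MARKER_NAME = ".recovery-in-progress"  # hypothetical file name


class BackupInUseError(RuntimeError):
    """Raised when a delete is attempted on a backup being recovered."""


@contextmanager
def recovery_marker(backup_dir):
    """Create a marker file for the duration of a recovery."""
    path = os.path.join(backup_dir, MARKER_NAME)
    # O_EXCL makes creation fail if the marker already exists, so this
    # sketch allows only one recovery per backup at a time.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as marker:
        marker.write(str(os.getpid()))
    try:
        yield path
    finally:
        os.remove(path)


def delete_backup(backup_dir):
    """Delete a backup directory unless a recovery marker is present."""
    if os.path.exists(os.path.join(backup_dir, MARKER_NAME)):
        raise BackupInUseError(f"{backup_dir} is being recovered")
    shutil.rmtree(backup_dir)


# Demo against a throwaway directory laid out like a backup.
base = tempfile.mkdtemp()
backup_dir = os.path.join(base, "20190211T205100")
os.makedirs(os.path.join(backup_dir, "data"))

with recovery_marker(backup_dir):
    try:
        delete_backup(backup_dir)
        refused = False
    except BackupInUseError:
        refused = True

survived_recovery = os.path.isdir(backup_dir)
delete_backup(backup_dir)  # no marker any more, so this succeeds
deleted_afterwards = not os.path.exists(backup_dir)
shutil.rmtree(base)
```

A real implementation would also have to handle stale markers left by a crashed recovery and close the gap between checking for the marker and deleting, for example by taking a lock around both steps.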

amenonsen commented 3 years ago

That makes sense. We'll look at this in conjunction with the keep feature (to override the retention policy).