garethgeorge / backrest

Backrest is a web UI and orchestrator for restic backup.
GNU General Public License v3.0
1.18k stars, 37 forks

repo gets locked too often preventing simultaneous backups #244

Open AlexZorzi opened 5 months ago

AlexZorzi commented 5 months ago

Describe the bug

Unlike the restic CLI, Backrest's default behavior is to lock the repo. This might be due to restic forget(?)

To Reproduce

Steps to reproduce the behavior:

  1. Start a backup on one host (in this example my repo is an HTTP restic-server)
  2. On the second host start another backup
  3. Second host finds a lock on the repo and isn't able to finish the backup

This is fine on small setups, since spacing the cron jobs far enough apart leaves time for restic to finish the backup and remove the lock. With a large number of hosts or with frequent backups, this ends up failing all the backups.

I guess the issue is that Backrest is running restic forget for every backup, and restic forget has to lock the repo to be safe. A solution could be to run restic forget only in a maintenance cron job and not at each backup.
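The locking concern above can be illustrated with a toy model of shared versus exclusive locks. This is only a sketch: `Lock` and `can_acquire` are hypothetical names, not restic or Backrest APIs, and it assumes that a backup takes a non-exclusive lock while forget takes an exclusive one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    host: str        # which machine holds the lock
    exclusive: bool  # assumption: True for forget/prune, False for backup


def can_acquire(held: list[Lock], wanted: Lock) -> bool:
    """Return True if `wanted` is compatible with the locks already held."""
    if wanted.exclusive:
        # An exclusive lock needs the repo to itself.
        return not held
    # A shared lock only conflicts with an existing exclusive lock.
    return not any(lock.exclusive for lock in held)


# Two hosts backing up at once: compatible.
backup_a = Lock("host-a", exclusive=False)
backup_b = Lock("host-b", exclusive=False)
print(can_acquire([backup_a], backup_b))  # True

# host-a is in its forget step while host-b wants to back up: blocked.
forget_a = Lock("host-a", exclusive=True)
print(can_acquire([forget_a], backup_b))  # False
```

Under this model, the more often any host enters the exclusive forget step, the more often other hosts' backups are turned away.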

Expected behavior

Backups should be able to run on multiple hosts at the same time, with the repo locked only while a maintenance job is running.

Platform Info

AlexZorzi commented 5 months ago

The Auto Unlock option for the repo is not a solution, since it would make the restic forget command unsafe. My solution is to reduce how often restic forget runs altogether, while remaining safe.

watn3y commented 5 months ago

It seems like currently we run restic forget after every backup and restic prune if the last run is older than Max Frequency Days
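As a rough sketch of the behavior described above (not Backrest's actual code), the post-backup decision might look like the following. The function names are hypothetical, "last run" is assumed to mean the last prune run, and a repo that has never been pruned is assumed to count as due.

```python
from datetime import datetime, timedelta
from typing import Optional


def prune_due(last_prune: Optional[datetime], now: datetime,
              max_frequency_days: int) -> bool:
    """True if the last prune is older than max_frequency_days."""
    if last_prune is None:
        # Assumption: a repo that was never pruned is due.
        return True
    return now - last_prune > timedelta(days=max_frequency_days)


def post_backup_steps(last_prune: Optional[datetime], now: datetime,
                      max_frequency_days: int) -> list[str]:
    """List the maintenance steps that follow a single backup."""
    steps = ["forget"]  # forget runs after every backup
    if prune_due(last_prune, now, max_frequency_days):
        steps.append("prune")
    return steps


now = datetime(2024, 5, 10)
print(post_backup_steps(datetime(2024, 5, 8), now, 7))  # ['forget']
print(post_backup_steps(datetime(2024, 5, 1), now, 7))  # ['forget', 'prune']
```

The point of the sketch is that `forget` is unconditional here, so every backup on every host ends with a step that needs the repo lock.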

One possible solution would be to run forget and prune on a set schedule instead of after every backup.

For example:

Of course, one option is to just use a separate repo for each host. But deduplication is nice and by using a single repo we get to take advantage of it :)

Also, if #221 ever gets implemented, the current design would potentially prevent backups (on the same host) from running in parallel.

garethgeorge commented 5 months ago

This is tricky. Agreed re: more flexible scheduling policies being the long-term fix -- this is likely a little ways out. I think for now multiple repos (or careful selection of backup schedules) is the correct way to handle this.

I think if you are using Backrest to back up a large number of systems, the right approach for now would be to use retention policy = none and to create a cron script on a single host that does your forget and prune operations.
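A minimal sketch of such a maintenance script, in Python, could look like this. The function names and the retention values (7 daily, 4 weekly) are placeholders chosen for illustration, not recommendations from this thread. It assumes `restic` is on the PATH and that the repository password is supplied through the environment (for example `RESTIC_PASSWORD_FILE`).

```python
import subprocess


def build_forget_command(repo: str, keep_daily: int = 7,
                         keep_weekly: int = 4) -> list[str]:
    """Build a `restic forget --prune` command line for one repository."""
    return [
        "restic", "-r", repo, "forget",
        "--keep-daily", str(keep_daily),    # placeholder retention policy
        "--keep-weekly", str(keep_weekly),  # placeholder retention policy
        "--prune",  # remove unreferenced data in the same run
    ]


def run_maintenance(repo: str, runner=subprocess.run) -> int:
    """Run forget and prune once and return restic's exit code.

    `runner` is injectable so the function can be exercised without restic.
    """
    result = runner(build_forget_command(repo), check=False)
    return result.returncode
```

The script would be scheduled from cron on one host only, calling something like `run_maintenance("rest:http://backup.example.com:8000/")` (a made-up repository URL), while every Backrest instance keeps its retention policy set to none so that no other host runs forget.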

But this is a good bug report -- it's definitely a goal that this should just work entirely within Backrest (e.g. without requiring external scripts).