garethgeorge / backrest

Backrest is a web UI and orchestrator for restic backup.
GNU General Public License v3.0

Support running backups in parallel #221

Open watn3y opened 3 months ago

watn3y commented 3 months ago

Is your feature request related to a problem? Please describe.

Backrest doesn't seem to support running multiple backups simultaneously. Since restic itself can run backups without exclusively locking the repo, I see no reason why Backrest couldn't.

Describe the solution you'd like

A config option that limits the number of parallel backups, either globally or on a per-repo basis.

garethgeorge commented 3 months ago

Hmm, thinking through this I can see that the idea of parallel backup options is immediately appealing as it sounds like it should be a speedup -- but I'm interested to think through what the real performance gains will be (and what costs / risks if any may come with concurrent backups):

Things that jump out to me are

I see parallelism as introducing some risks, and Backrest does aim to be opinionated in places where it can avoid "footguns", i.e. places where users can accidentally break things for themselves or simply don't stand to gain.

I'm curious how much value it adds to the way you use backrest / how much speedup you're expecting? I need some convincing that there's a strong value add from parallelism & that it'll be a big UX improvement to justify the complexity and risk.

watn3y commented 3 months ago

You make some valid points, but I still believe that in some cases it would be beneficial to run backups in parallel.

For example: I am running Backrest on a server with a 10G uplink. I have it connected to 2 repos, each with a 1G link.

Every 15 minutes I run a backup of some smaller files to repo 1. Once a day I back up my larger files to repo 2; this usually takes up to an hour.

While the daily backup to repo 2 is running, backrest doesn't start any of the 4 scheduled backups to repo 1.

An option to run backups to different repos independently of each other would be great to have here.

brandonkal commented 3 months ago

Parallel backups generally will be a speedup. I am running a backup and the bottleneck is the bandwidth+latency of the S3 service, not my upload speed.

Additionally, it would be nice to be able to pause a backup task.

garethgeorge commented 3 months ago

OK, I'm open to this, but I'm going to consider it low priority for now. My near-term focus is continuing to improve reliability, unblocking some workflows by improving hook handling, and adding multi-host management as a feature. I'll defer parallel execution for now as it makes some of that more challenging. Looking forward though, architecturally Backrest does have the right concurrency controls in place to make this possible both on the backend and on the networking side.

> Additionally, it would be nice to be able to pause a backup task.

Unfortunately restic doesn't support pausing operations, but it does use content-based hashing for deduplication. If you restart a backup, you won't end up using more storage in your repo (and in many cases I suspect you won't re-upload anything, but that might be a question for the restic forum :) ).
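To illustrate why a restarted backup need not consume extra storage, here is a minimal sketch of content-addressed deduplication. It is a simplification and not restic's code: restic splits files with content-defined chunking, while this example assumes fixed-size chunks, and `Store`, `NewStore`, `Put` and `Backup` are hypothetical names. The idea it shows is only that a chunk keyed by the SHA-256 of its content is stored once, however often it is submitted.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// Store is a toy in-memory content-addressed store (illustrative only).
type Store struct {
	blobs map[string][]byte // chunk ID (hex SHA-256) -> chunk data
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{blobs: make(map[string][]byte)}
}

// Put stores a chunk under the hash of its content. It reports whether the
// chunk was new; a chunk that is already present is not stored again.
func (s *Store) Put(chunk []byte) (id string, isNew bool) {
	sum := sha256.Sum256(chunk)
	id = hex.EncodeToString(sum[:])
	if _, ok := s.blobs[id]; ok {
		return id, false
	}
	s.blobs[id] = append([]byte(nil), chunk...) // keep a private copy
	return id, true
}

// Backup splits data into fixed-size chunks (an assumption made for brevity)
// and returns how many chunks actually had to be stored.
func (s *Store) Backup(data []byte, chunkSize int) int {
	if chunkSize < 1 {
		chunkSize = 1
	}
	stored := 0
	for i := 0; i < len(data); i += chunkSize {
		end := i + chunkSize
		if end > len(data) {
			end = len(data)
		}
		if _, isNew := s.Put(data[i:end]); isNew {
			stored++
		}
	}
	return stored
}
```

Calling `Backup` twice with the same data stores chunks only on the first call; the second call finds every chunk ID already present. Whether restic also avoids re-reading or re-uploading data after an interrupted run depends on details this sketch does not model.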

watn3y commented 3 months ago

I'm with you on that. Thanks for considering it, and thanks for this great piece of software :)