garethgeorge / backrest

Backrest is a web UI and orchestrator for restic backup.
GNU General Public License v3.0
1.11k stars 37 forks

Make backrest suitable for laptop #372

Open pmozzati opened 2 months ago

pmozzati commented 2 months ago

Usage description: I'd like to use backrest on my laptop. Like many home laptops, I guess, mine is typically turned on only when needed, so the machine is subject to frequent reboots. Since backrest always initializes its backup schedule relative to the startup time, it could possibly never start a backup.

Solution proposed: I'd like an option that allows backrest to check its last run and take a new snapshot if none has been taken within a set time span (a day, a week or so...). When I used rclone to back up my data, I wrote a systemd timer unit that executed a script at every boot. If the backup completed successfully, the script wrote a timestamp to a file. On the next reboot, the script checked the timestamp and, if more than a day had passed, it took a new snapshot.
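The timestamp approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the original script and not backrest code: the function names, the stamp file, and the one-day default are all assumptions.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional


def read_last_success(stamp_file: Path) -> Optional[datetime]:
    """Return the time of the last successful backup, or None if unknown."""
    try:
        return datetime.fromisoformat(stamp_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        # Missing or unreadable stamp: treat as "never backed up".
        return None


def backup_is_due(last_success: Optional[datetime], now: datetime,
                  max_age: timedelta = timedelta(days=1)) -> bool:
    """Due if no backup ever succeeded, or the last one is older than max_age."""
    return last_success is None or now - last_success > max_age


def record_success(stamp_file: Path, now: datetime) -> None:
    """Call this only after the backup finished successfully."""
    stamp_file.write_text(now.isoformat())
```

A boot-time script would call `backup_is_due(read_last_success(path), datetime.now())`, run the backup if it returns `True`, and call `record_success` only on success, so a failed run is retried at the next boot.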

garethgeorge commented 2 months ago

Posting here to ack the feature request and the interest I'm seeing on the bug -- this is definitely something that makes sense to support for laptop users.

My initial thoughts are to support this with a simple enum on the schedule to specify whether it's relative to the last run of the schedule OR whether it's relative to the time the task is scheduled (e.g. backrest startup) https://github.com/garethgeorge/backrest/blob/fe0e2b9d5ee43367b0142b7d9df597bf8d4ccde1/proto/v1/config.proto#L121-L128 .

I'm wondering if it may still make sense to ensure that tasks that are scheduled relative to the last run perhaps have a randomized start delay of 0-5 minutes after the process starts to avoid the thundering herd problem on boot.
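To make the two modes and the start delay concrete, here is a minimal sketch. The enum names, the `next_run` function, and the decision to apply the random delay only to overdue tasks are assumptions for illustration; they are not the values in `config.proto` or backrest's actual behavior.

```python
import random
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Clock(Enum):
    # Hypothetical names, not taken from backrest's config.proto.
    RELATIVE_TO_STARTUP = "startup"
    RELATIVE_TO_LAST_RUN = "last_run"


def next_run(clock: Clock, interval: timedelta, started_at: datetime,
             last_run: Optional[datetime],
             max_jitter: timedelta = timedelta(minutes=5),
             rng=random) -> datetime:
    """Compute when a task should next run after the process starts."""
    if clock is Clock.RELATIVE_TO_STARTUP:
        # The behavior reported in this issue: the countdown restarts
        # from scratch every time the process starts.
        return started_at + interval
    due = None if last_run is None else last_run + interval
    if due is None or due <= started_at:
        # Never run or overdue: start soon after startup, with a random
        # delay so that several overdue tasks do not all fire at once.
        return started_at + timedelta(
            seconds=rng.uniform(0, max_jitter.total_seconds()))
    return due
```

With a 24-hour interval and a last run three days ago, the startup-relative mode waits another 24 hours, while the last-run-relative mode starts within five minutes of startup.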

jburnham commented 2 months ago

I think Anacron works in a similar manner as the systemd timers by storing the time of the last run to use as a basis for when it next needs to run. The Anacrontab also allows you to specify a "delay in minutes" before execution to prevent that thundering herd problem you mentioned, and likely should be supported here.

Is there a particular reason to potentially support both "relative to the last run" and "relative to the time the task is scheduled"? I'm not currently able to understand the difference in how tasks would run under each setting.

As long as the next job runs no earlier than its next scheduled runtime (to satisfy the max frequency setting), then the next time a laptop powers up from sleep or shutdown, it should run the late job and schedule another one from that time. Alternatively, the orchestrator loop could constantly look for plans whose last run time is older than the max frequency setting allows. I think you lose some of the UI niceness of showing specifically when the next job will run. I don't personally need that, or it can reflect an "expected" next run time.
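The "orchestrator loop looking for late plans" idea might look something like the sketch below. The `Plan` record, its field names, and the helper functions are hypothetical and do not reflect backrest's internals; the sketch only shows that an "expected" next run time falls out of the last run time plus the interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Plan:
    # Hypothetical plan record; field names are illustrative only.
    name: str
    interval: timedelta
    last_run: Optional[datetime] = None

    def expected_next_run(self) -> Optional[datetime]:
        """The "expected" time a UI could display; None if never run."""
        if self.last_run is None:
            return None
        return self.last_run + self.interval


def overdue_plans(plans: List[Plan], now: datetime) -> List[Plan]:
    """Plans that never ran, or whose expected next run has passed."""
    result = []
    for plan in plans:
        expected = plan.expected_next_run()
        if expected is None or expected <= now:
            result.append(plan)
    return result


def run_overdue(plans: List[Plan], now: datetime,
                run: Callable[[Plan], None]) -> None:
    """One pass of the loop: run each late plan and reschedule from now."""
    for plan in overdue_plans(plans, now):
        run(plan)
        plan.last_run = now
```

Calling `run_overdue` on wake or startup, and then periodically, would run late jobs once and base the next expected run on the time they actually ran.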

I have pre and post hooks that submit to healthchecks.io. As long as it runs once an hour (my schedule), it shouldn't complain. I do have a grace period of like 3 days as I expect backups to stop working during the weekend. This allows a backup to run on Monday which refreshes the check. Any CONDITION_ANY_ERROR sends a /fail immediately to my check, so I wouldn't normally have to wait 3 days before finding out I'm not getting any backups. I pause the check if I expect the laptop to not be running for any extended period of time.

I admit I have not yet encountered this issue as I only today switched my laptop to using Backrest to run the scheduled jobs and only saw this issue after. I previously was using a Launchd agent with a 3600 StartInterval to run the restic backup manually, and using Backrest as just a UI to browse my snapshots.

pmozzati commented 2 months ago

> I think Anacron works in a similar manner as the systemd timers by storing the time of the last run to use as a basis for when it next needs to run. The Anacrontab also allows you to specify a "delay in minutes" before execution to prevent that thundering herd problem you mentioned, and likely should be supported here.

AFAIK, on systemd-based Linux distros, cron is an interface to systemd timer units, so systemd should be able to perform any cron or anacron operation.

I also agree with adding a delay before the backup starts.

I'm not an IT expert, and I'm also interested in understanding the differences between the two approaches to backup scheduling (by startup or by last run). I think the first is more useful for defining a specific execution time (think of a production server that takes a snapshot when no one is using its services, e.g. at a certain hour during the night), while the second is better when the goal is to take at least one snapshot (say, once a day) as soon as possible if none has been taken within a specified time span.
But, as I said, I may be wrong since I don't actually know the internal mechanism of backrest.

VFansss commented 2 months ago

This is basically the only thing that makes me raise an eyebrow: this behavior could be usable on a NAS that ideally never shuts down, but it is quite dangerous if backrest is used on a desktop PC.

Since the day I installed Backrest (14th of July), it has never started a daily backup!

Of course I shut down my PC, and at the next startup backrest simply doesn't consider that it is late for a previous schedule and creates a new one for tomorrow (which never comes, because when I turn on the PC tomorrow it will schedule one for the day after tomorrow):

image

I don't know what would be a good approach to solve this, but I guess there could be a way to check whether a "scheduling tick" should have happened since the day of the last backup. If so, start the backup.
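One way to express the "missed scheduling tick" check for a daily schedule at a fixed time is sketched below. The function names are made up for illustration, and the sketch assumes naive local datetimes and ignores time zones and DST.

```python
from datetime import datetime, time, timedelta


def latest_tick(now: datetime, at: time) -> datetime:
    """Most recent daily tick (at time `at`) that is not after `now`."""
    tick = datetime.combine(now.date(), at)
    if tick > now:
        # Today's tick has not happened yet; the latest one was yesterday.
        tick -= timedelta(days=1)
    return tick


def missed_tick(last_backup: datetime, now: datetime, at: time) -> bool:
    """True if a tick should have fired since the last backup started."""
    return last_backup < latest_tick(now, at)
```

For a schedule at 02:00 with the last backup on 14 July at 02:00, switching the PC on at 09:00 on 15 July gives `missed_tick(...) == True`, because the 02:00 tick on 15 July passed while the machine was off.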

> I'm wondering if it may still make sense to ensure that tasks that are scheduled relative to the last run perhaps have a randomized start delay of 0-5 minutes after the process starts to avoid the thundering herd problem on boot.

I don't think that, for a PC/laptop use case, a 5-minute delay is a big issue. Certainly not like a daily backup that never starts!

garethgeorge commented 3 weeks ago

Implemented some initial support for "relative scheduling" (e.g. in relation to the last run of a task) in https://github.com/garethgeorge/backrest/pull/439 .

I'll probably refine this a bit in a follow-up, as I'm noticing that the number of modes I've now added clutters the scheduling UI and makes it hard to pick a sane default. It looks like this at the moment:

image

which is just too many options :)

I'll follow up with work to filter down that set to a reasonable spread that covers most use cases. I'm also interested in input re: which of these options are useful. I may simply deduplicate the "every N hours" and "every N days" options; only one of those is necessary.

Also planning a bit of follow-up work to implement a startup delay for tasks to address the "thundering herd" issue discussed earlier, and with that done I expect this support to ship in 0.15.0.

pmozzati commented 2 weeks ago

Thanks a lot for your development effort. In my opinion, you can safely eliminate the "every N days" option from the scheduling UI ("every N hours" would be fine), although it could be useful when working with prune and forget. Alternatively, have you thought about drop-down lists? I mean, it could be something like:

Backup schedule: every [Input value; 0 = disabled] (hours; days) relative to (startup time, last run)

Here, round brackets contain the drop-down list's options and square brackets mark an input field, where 0 means disabled. I hope this is understandable despite my bad English.
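The proposed form maps onto a small data model. The sketch below is only one possible reading of the proposal: the class name, field names, and option strings are assumptions, not backrest's configuration schema.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Drop-down options from the proposal above.
UNITS = {"hours": timedelta(hours=1), "days": timedelta(days=1)}
ANCHORS = ("startup time", "last run")


@dataclass(frozen=True)
class ScheduleForm:
    # Hypothetical model of the proposed form.
    every: int        # input field; 0 means disabled
    unit: str         # "hours" or "days"
    relative_to: str  # "startup time" or "last run"

    def __post_init__(self) -> None:
        if self.every < 0:
            raise ValueError("every must be >= 0")
        if self.unit not in UNITS:
            raise ValueError(f"unknown unit: {self.unit!r}")
        if self.relative_to not in ANCHORS:
            raise ValueError(f"unknown anchor: {self.relative_to!r}")

    def interval(self) -> Optional[timedelta]:
        """Interval between backups, or None when the schedule is disabled."""
        if self.every == 0:
            return None
        return self.every * UNITS[self.unit]

    def describe(self) -> str:
        if self.every == 0:
            return "Backup schedule: disabled"
        return (f"Backup schedule: every {self.every} {self.unit} "
                f"relative to {self.relative_to}")
```

Because the unit is a drop-down, "every N hours" and "every N days" collapse into a single input, which is one way to reduce the number of modes shown in the UI.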