garethgeorge / backrest

Backrest is a web UI and orchestrator for restic backup.
GNU General Public License v3.0

Prune operations not cleared automatically from plan history (again) #530

Open modem opened 2 weeks ago

modem commented 2 weeks ago

Describe the bug: From what I understand, the prune operation list should be cleared after 30 days, but in my case I have entries going back to early June. This issue was already addressed in the past in https://github.com/garethgeorge/backrest/issues/248.

I wonder if the fix got lost, or if there's an issue with my repos, which were created a long time ago.

Screenshots: Here's a screenshot from one of my repos: [image] Another one has even more entries.

Expected behavior: The list of prune operations should show only the ones run in the last 30 days.

Platform Info

garethgeorge commented 2 weeks ago

Interesting, I see your point. I made a change here such that prune and check operations are kept for up to a year, with the expectation that they’re typically run pretty infrequently (weekly or monthly).

I might need to either look at making this configurable, or doing some sort of sliding window e.g. “the last year OR up to 100 operations of a given type per repo”.
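One possible reading of the sliding-window idea, sketched in Go: within each (repo, type) group, keep only operations newer than a maximum age, capped at the N newest. The `Operation` struct, `retain` function, and millisecond timestamps are hypothetical simplifications for illustration, not Backrest's actual types or API, and "whichever limit is hit first" is an assumed interpretation of the "OR".

```go
package main

import (
	"fmt"
	"sort"
)

// Operation is a hypothetical, simplified record (not Backrest's real type).
type Operation struct {
	RepoID     string
	Type       string // e.g. "prune" or "check"
	UnixTimeMs int64
}

type groupKey struct{ repoID, opType string }

// retain returns the operations to keep. Within each (repo, type) group it
// keeps the newest operations until either maxCount is reached or an
// operation is older than maxAgeMs, whichever comes first.
func retain(ops []Operation, nowMs, maxAgeMs int64, maxCount int) []Operation {
	groups := make(map[groupKey][]Operation)
	for _, op := range ops {
		k := groupKey{op.RepoID, op.Type}
		groups[k] = append(groups[k], op)
	}
	var kept []Operation
	for _, g := range groups {
		// Newest first, so we can stop at the first operation that fails a limit.
		sort.Slice(g, func(i, j int) bool { return g[i].UnixTimeMs > g[j].UnixTimeMs })
		for i, op := range g {
			if i >= maxCount || nowMs-op.UnixTimeMs > maxAgeMs {
				break
			}
			kept = append(kept, op)
		}
	}
	return kept
}

func main() {
	ops := []Operation{
		{"r1", "prune", 1000},
		{"r1", "prune", 2000},
		{"r1", "prune", 3000},
		{"r1", "check", 100},
		{"r2", "prune", 2500},
	}
	// Keep at most 2 per (repo, type), none older than 1500 ms.
	fmt.Println(len(retain(ops, 3000, 1500, 2))) // prints 3
}
```

Note that under this reading a group can end up with no retained operations at all (the old "check" above), which matters for scheduling that depends on finding the last run.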

For a bit of context, the original reason for the change was to allow “last run” scheduling to work: it needs the retention to be long enough that it can find the last time the task ran to determine when to run again. The simple option I went with here was just to make this a year, but that’s perhaps too long.
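To illustrate why retention interacts with "last run" scheduling, here is a minimal Go sketch. The `Operation` struct and `nextRunMs` function are hypothetical, and running immediately when no prior record exists is an assumption made for the example, not confirmed Backrest behavior.

```go
package main

import "fmt"

// Operation is a hypothetical, simplified history record.
type Operation struct {
	Type       string
	UnixTimeMs int64
}

// nextRunMs computes the next run time for opType as "last run + interval".
// If the history holds no record of that type (for example because it was
// already garbage collected), it falls back to nowMs. That fallback is an
// assumption for this sketch.
func nextRunMs(history []Operation, opType string, intervalMs, nowMs int64) int64 {
	var last int64
	found := false
	for _, op := range history {
		if op.Type == opType && (!found || op.UnixTimeMs > last) {
			last, found = op.UnixTimeMs, true
		}
	}
	if !found {
		return nowMs
	}
	return last + intervalMs
}

func main() {
	history := []Operation{{"prune", 1000}, {"backup", 1100}}
	fmt.Println(nextRunMs(history, "prune", 500, 1200)) // prints 1500
	// With the prune record removed, the schedule loses its anchor.
	fmt.Println(nextRunMs(history[1:], "prune", 500, 1200)) // prints 1200
}
```

If the retention window is shorter than the schedule interval, the last-run record disappears before it is needed, which is the situation the one-year retention was meant to avoid.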

In the meantime, I would recommend considering whether you really want to run prune and check daily. Perhaps a weekly cadence would work similarly well? But I agree that I need to limit this.

modem commented 2 weeks ago

Thanks for the context, Gareth. I have the prune operation scheduled every 7 days, but sometimes I need to trigger it manually when I'm running out of space. Since that data is already gone (not part of any snapshot), it makes sense to me to reclaim the occupied space. I don't have a good understanding of the scheduling, but to me it makes sense for the last run to be based on the last backup operation. I also don't see a reason to keep a long prune history, since it brings no added value in the long run... Maybe it could be linked with a new system operations prune policy, defined per plan...

garethgeorge commented 1 week ago

Agree -- your use case / way of configuring it is definitely a valid one, just very different from the policies I use so I hadn't considered it :)

The scheduling logic is based on the last operation of a given type to ensure operations aren't skipped, so I think I need to update the garbage collection to just do some grouping by repo, plan, and type, and keep the latest of each (or something of that sort).
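A rough Go sketch of that grouping idea: operations older than a cutoff become eligible for removal unless they are the latest in their (repo, plan, type) group. The `Operation` struct and the `collectible` function are hypothetical names for illustration and do not reflect Backrest's actual garbage collection code.

```go
package main

import "fmt"

// Operation is a hypothetical, simplified record (not Backrest's real type).
type Operation struct {
	ID         int
	RepoID     string
	PlanID     string
	Type       string
	UnixTimeMs int64
}

type groupKey struct{ repoID, planID, opType string }

func keyOf(op Operation) groupKey {
	return groupKey{op.RepoID, op.PlanID, op.Type}
}

// collectible returns the IDs of operations that are older than cutoffMs and
// are not the most recent operation in their (repo, plan, type) group.
func collectible(ops []Operation, cutoffMs int64) []int {
	// First pass: find the latest operation in each group.
	latest := make(map[groupKey]Operation)
	for _, op := range ops {
		if cur, ok := latest[keyOf(op)]; !ok || op.UnixTimeMs > cur.UnixTimeMs {
			latest[keyOf(op)] = op
		}
	}
	// Second pass: old operations are removable unless they are a group's latest.
	var ids []int
	for _, op := range ops {
		if op.UnixTimeMs < cutoffMs && latest[keyOf(op)].ID != op.ID {
			ids = append(ids, op.ID)
		}
	}
	return ids
}

func main() {
	ops := []Operation{
		{1, "r1", "p1", "prune", 100},
		{2, "r1", "p1", "prune", 200},
		{3, "r1", "p1", "check", 50},
		{4, "r1", "p2", "prune", 150},
		{5, "r1", "p1", "prune", 900},
	}
	fmt.Println(collectible(ops, 500)) // prints [1 2]
}
```

Operations 3 and 4 are older than the cutoff but survive because each is the latest of its group, so the scheduler can still find a last run for every repo, plan, and type.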