Craeckie closed this 1 month ago
Another example from today with the same problem including the whole output.
Rule monthly #1 is applied on server-2024-04-30, even though keep-weekly already ensures backups for April. And again, monthly is applied before all of keep-weekly are applied, which I believe contradicts the documentation.
I will try to reproduce the behavior on version 1.2.8 and will report here once the cache is synchronized.
I was able to reproduce the behavior in 1.2.8 (both client and server) with the same result:
Please specify what you think is wrong in your 1.2.8 reproduction.
For me, it looks fine (and it gives the rules that kept the archives, so you can see why these were kept).
As I tried to explain in my first post, borg does not behave as described in the documentation, and additionally does not follow my intuition.
In the documentation it says
The rules are applied from secondly to yearly, and backups selected by previous rules do not count towards those of later rules.
But instead, rule monthly #1 is applied for server-2024-04-30 before all weekly rules have been applied. I expect borg to only apply keep-monthly after the last keep-weekly (here: rule: weekly #6).
Additionally, the behavior does not follow my intuition, because keep-weekly already ensures that backups exist for April 2024 (server-2024-04-27, server-2024-04-20, ...). Thus, keep-monthly does not need to be applied to April 2024. Borg, however, keeps server-2024-04-30.
I would expect borg to only apply keep-monthly after keep-weekly, so server-2024-03-31 would be the first application of keep-monthly. If that were the case, borg would additionally keep the last backup of February 2024.
Therefore, the current behavior results in borg pruning all backups of February 2024 while keeping an unneeded backup (server-2024-04-30). This is contrary to both my intuition and my understanding of the documentation.
I hope this explanation clarifies this bug report.
Guess you misunderstood the docs.
The rules are applied in that order (you can check that in the code, the algorithm processes seconds first, ..., years last), but that does not imply that the resulting (sorted by date) output list is in that order. The order refers to what the code does, not to the backup dates.
You can easily verify that for that 1.2.8 output:
First it picks 6 weekly backups to keep:
Keeping archive (rule: weekly #1): server-2024-05-15 Wed, 2024-05-15 02:34:33 [c6477fdf5306619308a0de891f6a0b8cfd7ba573fe85942142f2fbc7911a0d34]
Keeping archive (rule: weekly #2): server-2024-05-12 Sun, 2024-05-12 02:34:57 [2d80882acc7dc32ca8163126c7bd98d5194fdcfa7e1fffd9c8202e425976aed5]
Keeping archive (rule: weekly #3): server-2024-05-04 Sat, 2024-05-04 02:33:56 [fcbfc85be120e9c66ef65872d91cd6377b1c9e09d894269c46ea425c5bde72c7]
Keeping archive (rule: weekly #4): server-2024-04-27 Sat, 2024-04-27 02:33:53 [672eeffa5acd3983c6ef18bbf8a2d6882e1cae393501ca213fbec3d07678ba95]
Keeping archive (rule: weekly #5): server-2024-04-20 Sat, 2024-04-20 02:34:08 [3546afeb8ca9ad22090494c0d20c3812e0263ec7bb0c9a00fa0e400aaa9c69b6]
Keeping archive (rule: weekly #6): server-2024-04-14 Sun, 2024-04-14 02:33:47 [9b0380b5d5926c7dee47783dd0440ef9f6f36ba72232755aa0337b3c7f709229]
Then it picks 2 (additional) monthly backups to keep:
Keeping archive (rule: monthly #1): server-2024-04-30 Tue, 2024-04-30 02:33:58 [2521252633dd13dbe4d32902b6b4f35769f863cd1c0153924253fd0393c02ab7]
Keeping archive (rule: monthly #2): server-2024-03-31 Sun, 2024-03-31 03:03:36 [f12a6fc422707ea535e3c688fde4de3e988fd51679cbcdac9fc3686b27f100b2]
Note: 2024-05-15 would usually count as a monthly backup for May, because it is the last existing backup in that month. But as that was already picked as a weekly backup, borg picks the last backups of April and March to keep. So, you can see that it applied the weekly rule first and then the monthly rule.
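To make the selection order concrete, here is a minimal Python sketch of the behavior described above. It is a simplified illustration, not borg's actual code: the names prune_split, PERIOD_KEY and kept_because are chosen for this example, and it assumes one archive per day from 2023-08-01 to 2023-11-16 (a hypothetical archive list modeled on the dates in the original report). Each rule walks the archives from newest to oldest and keeps the newest archive of each period, unless an earlier rule has already kept that archive, in which case that period is skipped for the later rule.

```python
from datetime import date, timedelta

# Period key per rule (simplified stand-in for the real period patterns).
PERIOD_KEY = {
    "weekly": lambda d: tuple(d.isocalendar()[:2]),  # (ISO year, ISO week)
    "monthly": lambda d: (d.year, d.month),
}

def prune_split(archives, rule, n, kept_because):
    """Keep the newest archive of up to n periods, newest period first.

    An archive already kept by an earlier rule is not kept again and
    does not count towards n, so the rule moves on to an older period.
    """
    keep = []
    last = None
    if n == 0:
        return keep
    for d in sorted(archives, reverse=True):
        period = PERIOD_KEY[rule](d)
        if period != last:  # first (newest) archive of a new period
            last = period
            if d not in kept_because:
                keep.append(d)
                kept_because[d] = (rule, len(keep))
                if len(keep) == n:
                    break
    return keep

# Assumption: one archive per day, 2023-08-01 .. 2023-11-16.
start, today = date(2023, 8, 1), date(2023, 11, 16)
archives = [start + timedelta(days=i) for i in range((today - start).days + 1)]

kept = {}
prune_split(archives, "weekly", 6, kept)   # weekly rule runs first
prune_split(archives, "monthly", 2, kept)  # monthly rule runs second

for d in sorted(kept, reverse=True):
    rule, idx = kept[d]
    print(f"server-{d.isoformat()}: {rule} #{idx}")
```

Under these assumptions, the monthly rule skips November (its newest archive, 2023-11-16, is already weekly #1) and keeps 2023-10-31 and 2023-09-30. The printed list is sorted by date, so monthly #1 appears between weekly #3 and weekly #4 even though the weekly rule ran to completion first.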
ah thanks for the clarification!
I thought that each of the keep rules effectively increases the time range of keeping backups, but instead these time ranges mostly overlap. Only if two rules would apply to the same backup is the later rule "postponed" to the next matching backup. I'm not sure how the documentation could be improved to clarify this for other people with the same misconception. If I have an idea, I will write it here.
Have you checked borgbackup docs, FAQ, and open GitHub issues?
Yes, this is likely related to changes in this pull request: https://github.com/borgbackup/borg/pull/5332
Is this a BUG / ISSUE report or a QUESTION?
BUG
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
Client: 1.2.0 Server: 1.2.8
Operating system (distribution) and version.
Client: Ubuntu Server 22.04.4 LTS Server: Static ARM Binary
Hardware / network configuration, and filesystems used.
Not relevant.
How much data is handled by borg?
Deduplicated: ~1Tb
Full borg commandline that lead to the problem (leave away excludes and passwords)
borg prune --progress --list --keep-weekly 6 --keep-monthly 2
Describe the problem you're observing.
When pruning, I got a different result from what I expected based on the documentation, which states: The rules are applied from secondly to yearly, and backups selected by previous rules do not count towards those of later rules.
today: server-2023-11-16
Expected
I expected that first, only keep-weekly is considered, and afterwards keep-monthly:
server-2023-11-16: weekly 1
server-2023-11-12: weekly 2
server-2023-11-05: weekly 3
server-2023-10-29: weekly 4
server-2023-10-22: weekly 5
server-2023-10-15: weekly 6
server-2023-09-30: monthly 1
server-2023-08-31: monthly 2
Actually
Even though keep-weekly already keeps backups for October 2023, keep-monthly still keeps the backup from the 31st of that month. As a result, server-2023-08-31 is pruned:
server-2023-11-16: weekly 1
server-2023-11-12: weekly 2
server-2023-11-05: weekly 3
server-2023-10-31: monthly 1!
server-2023-10-29: weekly 4
server-2023-10-22: weekly 5
server-2023-10-15: weekly 6
server-2023-09-30: monthly 2!
Is there a misunderstanding from my side? If not, this situation could serve as a test case for improving/fixing the prune logic.