borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

Possible bug in pruning logic with keep-weekly and keep-monthly #8221

Closed Craeckie closed 1 month ago

Craeckie commented 1 month ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yes, this is likely related to changes in this pull request: https://github.com/borgbackup/borg/pull/5332

Is this a BUG / ISSUE report or a QUESTION?

BUG

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

Client: 1.2.0
Server: 1.2.8

Operating system (distribution) and version.

Client: Ubuntu Server 22.04.4 LTS
Server: Static ARM Binary

Hardware / network configuration, and filesystems used.

Not relevant.

How much data is handled by borg?

Deduplicated: ~1 TB

Full borg command line that led to the problem (leave out excludes and passwords)

borg prune --progress --list --keep-weekly 6 --keep-monthly 2

Describe the problem you're observing.

When pruning, I got a different result from what I expected based on the documentation, which states: "The rules are applied from secondly to yearly, and backups selected by previous rules do not count towards those of later rules."

today: server-2023-11-16

Expected

I expected that first, only keep-weekly is considered, and afterwards keep-monthly:

server-2023-11-16: weekly 1
server-2023-11-12: weekly 2
server-2023-11-05: weekly 3
server-2023-10-29: weekly 4
server-2023-10-22: weekly 5
server-2023-10-15: weekly 6
server-2023-09-30: monthly 1
server-2023-08-31: monthly 2

Actual

Even though keep-weekly already keeps backups for October 2023, keep-monthly still keeps the backup from the 31st of that month. As a result, server-2023-08-31 was pruned:

server-2023-11-16: weekly 1
server-2023-11-12: weekly 2
server-2023-11-05: weekly 3
server-2023-10-31: monthly 1!
server-2023-10-29: weekly 4
server-2023-10-22: weekly 5
server-2023-10-15: weekly 6
server-2023-09-30: monthly 2!

Is there a misunderstanding from my side? If not, this situation could serve as a test case for improving/fixing the prune logic.

Craeckie commented 1 month ago

Another example from today with the same problem including the whole output.

Explanation

Rule monthly #1 is applied to server-2024-04-30, even though keep-weekly already ensures backups for April. Again, monthly is applied before all of the keep-weekly rules have been applied, which I believe contradicts the documentation.

Result

[root@server ~]# borg prune --progress --list --keep-weekly 6 --keep-monthly 2 -a 'server-*' --dry-run
```
Keeping archive (rule: weekly #1): server-2024-05-15 Wed, 2024-05-15 02:34:33 [c6477fdf5306619308a0de891f6a0b8cfd7ba573fe85942142f2fbc7911a0d34]
Would prune: server-2024-05-14 Tue, 2024-05-14 02:34:12 [6e8a3e80dfa773599575eb71350eee3319d7036cecaf44d7c2bca5306e3b52f9]
Would prune: server-2024-05-13 Mon, 2024-05-13 02:34:26 [4be110a90a61135274a3ffb780bb2d8995055da39c4f4051264e914fa57ef4bb]
Keeping archive (rule: weekly #2): server-2024-05-12 Sun, 2024-05-12 02:34:57 [2d80882acc7dc32ca8163126c7bd98d5194fdcfa7e1fffd9c8202e425976aed5]
Would prune: server-2024-05-11 Sat, 2024-05-11 02:34:34 [7ffa61643b8fb3650c626019e5c638003d7dacaa09dd52bc5827c0013fb46255]
Would prune: server-2024-05-10 Fri, 2024-05-10 02:34:25 [a537a09cb50b9b251c1ceccd1663655cc8edbb05e585166d28a108eb3c364597]
Would prune: server-2024-05-09 Thu, 2024-05-09 02:34:10 [4ea38a821da9d03404a8ac19c2fbd2bb89547eb9ee948f8735c914273b7d4910]
Would prune: server-2024-05-08 Wed, 2024-05-08 02:34:24 [3994a92bf0ba5ad4f94ef644c5b0319c67ca5db3c3665dd1611be8556b1bbbc0]
Would prune: server-2024-05-07 Tue, 2024-05-07 02:33:48 [b6a2fd997ffe68e5ee422c1e7043447a4cf7b9162dd295582a6eef14a558e9c6]
Keeping archive (rule: weekly #3): server-2024-05-04 Sat, 2024-05-04 02:33:56 [fcbfc85be120e9c66ef65872d91cd6377b1c9e09d894269c46ea425c5bde72c7]
Would prune: server-2024-05-03 Fri, 2024-05-03 02:33:46 [50a7eafd338d727a669b3f0919759423c48bcc62c802f684b9ed21c2a3c52a51]
Would prune: server-2024-05-02 Thu, 2024-05-02 02:33:37 [c7b111582922ba35e2a847cfebf95de2cfb38148f3def9f546012deb613bede8]
Would prune: server-2024-05-01 Wed, 2024-05-01 02:34:01 [6d3f1f7a8ff815bf5efd8de1ac4c048ee2774c04b02a7a9189dd2cb55317ca65]
Keeping archive (rule: monthly #1): server-2024-04-30 Tue, 2024-04-30 02:33:58 [2521252633dd13dbe4d32902b6b4f35769f863cd1c0153924253fd0393c02ab7]
Would prune: server-2024-04-29 Mon, 2024-04-29 02:33:48 [cef8f29a6cfbc7718f3ba77018efd2f93d9bca6f0ebbe89cad1aa2ea5746e02d]
Keeping archive (rule: weekly #4): server-2024-04-27 Sat, 2024-04-27 02:33:53 [672eeffa5acd3983c6ef18bbf8a2d6882e1cae393501ca213fbec3d07678ba95]
Would prune: server-2024-04-26 Fri, 2024-04-26 02:33:48 [d6e6d0513ab330db3391d50c145a94fa6013a8ca2ab5052c7761b78d555b77d4]
Would prune: server-2024-04-25 Thu, 2024-04-25 02:34:00 [127b0a220c149c5ae931bc6d75f15d49a8f6f0caa7d90c7c4273e3ff4059cbce]
Would prune: server-2024-04-24 Wed, 2024-04-24 02:34:41 [356905ead98e4a116d02ee0ea2c599eb6b8edeab0099d1e9dbf3ebbf9bb59bdb]
Would prune: server-2024-04-23 Tue, 2024-04-23 02:33:55 [3e5ee123efe088d719e7e8cb63ffdc95b27149771c5456c456fe17841d9c1c7b]
Would prune: server-2024-04-22 Mon, 2024-04-22 02:34:24 [0088f6b3b4413dacc6ac4ee9c2e2f8b3badb3031c8218948f0150e405267abbb]
Keeping archive (rule: weekly #5): server-2024-04-20 Sat, 2024-04-20 02:34:08 [3546afeb8ca9ad22090494c0d20c3812e0263ec7bb0c9a00fa0e400aaa9c69b6]
Would prune: server-2024-04-19 Fri, 2024-04-19 02:34:22 [e52a99767b20d6af004f3e0b63f2820541675d8ac0d97466af97c158c75bb6ed]
Would prune: server-2024-04-18 Thu, 2024-04-18 02:34:25 [74c69b54c3c344118b15fc972aeafed7d90f6f1e481df0390ad9fad25422a1fa]
Would prune: server-2024-04-16 Tue, 2024-04-16 02:34:46 [28523c6bef74db61a931f33f03059741eb0d70769258801c4f8bad4dfbb3a14e]
Would prune: server-2024-04-15 Mon, 2024-04-15 02:33:53 [ba30ed96dfff04d3941b1ca4310ca41515df062337661513adadc559d6b450a3]
Keeping archive (rule: weekly #6): server-2024-04-14 Sun, 2024-04-14 02:33:47 [9b0380b5d5926c7dee47783dd0440ef9f6f36ba72232755aa0337b3c7f709229]
Would prune: server-2024-04-13 Sat, 2024-04-13 02:33:47 [fb4e5a3f0a3f3b518790c238f477abff25435b1f93746920db2aa5613a387ef5]
Would prune: server-2024-04-12 Fri, 2024-04-12 02:33:43 [96682c7cb29860bdb8f8e552d58506cbb891c05b0fe4ddc42b57f29250988f4c]
Would prune: server-2024-04-11 Thu, 2024-04-11 02:33:56 [0d73b91703727adc592f7b87db0d3dc877caa10037c13eb62666f67f5ff023e0]
Would prune: server-2024-04-10 Wed, 2024-04-10 02:33:51 [d9878ea921079e6aefc62179f77b04467fdca4108b756f1b4fe727247f376346]
Would prune: server-2024-04-09 Tue, 2024-04-09 02:33:52 [efe2def894abeb1b495c27e9c8f2e02a18a95230af21a89bdc5f3b297913739b]
Would prune: server-2024-04-08 Mon, 2024-04-08 02:33:52 [ceef47ceae054b59d78eb6f7ba0e71f06694c24951f5ae1157606d111cac88d9]
Would prune: server-2024-04-07 Sun, 2024-04-07 02:33:47 [852aa780efdf395d0db1c9e8319a9939e3a2541683d2551fd58556cc3906f80b]
Would prune: server-2024-04-06 Sat, 2024-04-06 02:33:41 [8305809341ff7c135eac0be45fdf9e2b497a02d03147b388181fc00daf7fb3c3]
Would prune: server-2024-04-05 Fri, 2024-04-05 02:33:48 [4d6ac5cb4b68c1c2a8cc6d11afcc69bfca8e74ee07b3ddfa8c329ca4aa241938]
Would prune: server-2024-04-04 Thu, 2024-04-04 02:33:54 [01fd2d49c297e7b6b1a437ac6ecac3273b490b41488a909628cc8180bca727ef]
Would prune: server-2024-04-03 Wed, 2024-04-03 02:33:43 [586908413fddbf1d7fd3d53deb5c2a3c01769860f8345c3b73d4872d96079abf]
Would prune: server-2024-04-01 Mon, 2024-04-01 02:33:43 [f0b0c9ce782682b227cf1bc8543f2fabff36e197326be9d18b3eb4eae1a1bb0a]
Keeping archive (rule: monthly #2): server-2024-03-31 Sun, 2024-03-31 03:03:36 [f12a6fc422707ea535e3c688fde4de3e988fd51679cbcdac9fc3686b27f100b2]
Would prune: server-2024-03-30 Sat, 2024-03-30 02:34:19 [3a72272ca462dc1e8b691c0c918c89888cb335496579a6955eee1d280542bed6]
Would prune: server-2024-03-29 Fri, 2024-03-29 02:33:51 [35623cb7f639e516c6de24488af56f1e91fdd3e22c5c7399fea6b6e02592d76f]
...
```

Craeckie commented 1 month ago

I will try to reproduce the behavior on version 1.2.8 and will report back here once the cache is synchronized.

Craeckie commented 1 month ago

I was able to reproduce the behavior in 1.2.8 (both client and server) with the same result:

borg prune --progress --list --keep-weekly 6 --keep-monthly 2 -a 'server-*' --dry-run
```
Keeping archive (rule: weekly #1): server-2024-05-15 Wed, 2024-05-15 02:34:33 [c6477fdf5306619308a0de891f6a0b8cfd7ba573fe85942142f2fbc7911a0d34]
Would prune: server-2024-05-14 Tue, 2024-05-14 02:34:12 [6e8a3e80dfa773599575eb71350eee3319d7036cecaf44d7c2bca5306e3b52f9]
Would prune: server-2024-05-13 Mon, 2024-05-13 02:34:26 [4be110a90a61135274a3ffb780bb2d8995055da39c4f4051264e914fa57ef4bb]
Keeping archive (rule: weekly #2): server-2024-05-12 Sun, 2024-05-12 02:34:57 [2d80882acc7dc32ca8163126c7bd98d5194fdcfa7e1fffd9c8202e425976aed5]
Would prune: server-2024-05-11 Sat, 2024-05-11 02:34:34 [7ffa61643b8fb3650c626019e5c638003d7dacaa09dd52bc5827c0013fb46255]
Would prune: server-2024-05-10 Fri, 2024-05-10 02:34:25 [a537a09cb50b9b251c1ceccd1663655cc8edbb05e585166d28a108eb3c364597]
Would prune: server-2024-05-09 Thu, 2024-05-09 02:34:10 [4ea38a821da9d03404a8ac19c2fbd2bb89547eb9ee948f8735c914273b7d4910]
Would prune: server-2024-05-08 Wed, 2024-05-08 02:34:24 [3994a92bf0ba5ad4f94ef644c5b0319c67ca5db3c3665dd1611be8556b1bbbc0]
Would prune: server-2024-05-07 Tue, 2024-05-07 02:33:48 [b6a2fd997ffe68e5ee422c1e7043447a4cf7b9162dd295582a6eef14a558e9c6]
Keeping archive (rule: weekly #3): server-2024-05-04 Sat, 2024-05-04 02:33:56 [fcbfc85be120e9c66ef65872d91cd6377b1c9e09d894269c46ea425c5bde72c7]
Would prune: server-2024-05-03 Fri, 2024-05-03 02:33:46 [50a7eafd338d727a669b3f0919759423c48bcc62c802f684b9ed21c2a3c52a51]
Would prune: server-2024-05-02 Thu, 2024-05-02 02:33:37 [c7b111582922ba35e2a847cfebf95de2cfb38148f3def9f546012deb613bede8]
Would prune: server-2024-05-01 Wed, 2024-05-01 02:34:01 [6d3f1f7a8ff815bf5efd8de1ac4c048ee2774c04b02a7a9189dd2cb55317ca65]
Keeping archive (rule: monthly #1): server-2024-04-30 Tue, 2024-04-30 02:33:58 [2521252633dd13dbe4d32902b6b4f35769f863cd1c0153924253fd0393c02ab7]
Would prune: server-2024-04-29 Mon, 2024-04-29 02:33:48 [cef8f29a6cfbc7718f3ba77018efd2f93d9bca6f0ebbe89cad1aa2ea5746e02d]
Keeping archive (rule: weekly #4): server-2024-04-27 Sat, 2024-04-27 02:33:53 [672eeffa5acd3983c6ef18bbf8a2d6882e1cae393501ca213fbec3d07678ba95]
Would prune: server-2024-04-26 Fri, 2024-04-26 02:33:48 [d6e6d0513ab330db3391d50c145a94fa6013a8ca2ab5052c7761b78d555b77d4]
Would prune: server-2024-04-25 Thu, 2024-04-25 02:34:00 [127b0a220c149c5ae931bc6d75f15d49a8f6f0caa7d90c7c4273e3ff4059cbce]
Would prune: server-2024-04-24 Wed, 2024-04-24 02:34:41 [356905ead98e4a116d02ee0ea2c599eb6b8edeab0099d1e9dbf3ebbf9bb59bdb]
Would prune: server-2024-04-23 Tue, 2024-04-23 02:33:55 [3e5ee123efe088d719e7e8cb63ffdc95b27149771c5456c456fe17841d9c1c7b]
Would prune: server-2024-04-22 Mon, 2024-04-22 02:34:24 [0088f6b3b4413dacc6ac4ee9c2e2f8b3badb3031c8218948f0150e405267abbb]
Keeping archive (rule: weekly #5): server-2024-04-20 Sat, 2024-04-20 02:34:08 [3546afeb8ca9ad22090494c0d20c3812e0263ec7bb0c9a00fa0e400aaa9c69b6]
Would prune: server-2024-04-19 Fri, 2024-04-19 02:34:22 [e52a99767b20d6af004f3e0b63f2820541675d8ac0d97466af97c158c75bb6ed]
Would prune: server-2024-04-18 Thu, 2024-04-18 02:34:25 [74c69b54c3c344118b15fc972aeafed7d90f6f1e481df0390ad9fad25422a1fa]
Would prune: server-2024-04-16 Tue, 2024-04-16 02:34:46 [28523c6bef74db61a931f33f03059741eb0d70769258801c4f8bad4dfbb3a14e]
Would prune: server-2024-04-15 Mon, 2024-04-15 02:33:53 [ba30ed96dfff04d3941b1ca4310ca41515df062337661513adadc559d6b450a3]
Keeping archive (rule: weekly #6): server-2024-04-14 Sun, 2024-04-14 02:33:47 [9b0380b5d5926c7dee47783dd0440ef9f6f36ba72232755aa0337b3c7f709229]
Would prune: server-2024-04-13 Sat, 2024-04-13 02:33:47 [fb4e5a3f0a3f3b518790c238f477abff25435b1f93746920db2aa5613a387ef5]
Would prune: server-2024-04-12 Fri, 2024-04-12 02:33:43 [96682c7cb29860bdb8f8e552d58506cbb891c05b0fe4ddc42b57f29250988f4c]
Would prune: server-2024-04-11 Thu, 2024-04-11 02:33:56 [0d73b91703727adc592f7b87db0d3dc877caa10037c13eb62666f67f5ff023e0]
Would prune: server-2024-04-10 Wed, 2024-04-10 02:33:51 [d9878ea921079e6aefc62179f77b04467fdca4108b756f1b4fe727247f376346]
Would prune: server-2024-04-09 Tue, 2024-04-09 02:33:52 [efe2def894abeb1b495c27e9c8f2e02a18a95230af21a89bdc5f3b297913739b]
Would prune: server-2024-04-08 Mon, 2024-04-08 02:33:52 [ceef47ceae054b59d78eb6f7ba0e71f06694c24951f5ae1157606d111cac88d9]
Would prune: server-2024-04-07 Sun, 2024-04-07 02:33:47 [852aa780efdf395d0db1c9e8319a9939e3a2541683d2551fd58556cc3906f80b]
Would prune: server-2024-04-06 Sat, 2024-04-06 02:33:41 [8305809341ff7c135eac0be45fdf9e2b497a02d03147b388181fc00daf7fb3c3]
Would prune: server-2024-04-05 Fri, 2024-04-05 02:33:48 [4d6ac5cb4b68c1c2a8cc6d11afcc69bfca8e74ee07b3ddfa8c329ca4aa241938]
Would prune: server-2024-04-04 Thu, 2024-04-04 02:33:54 [01fd2d49c297e7b6b1a437ac6ecac3273b490b41488a909628cc8180bca727ef]
Would prune: server-2024-04-03 Wed, 2024-04-03 02:33:43 [586908413fddbf1d7fd3d53deb5c2a3c01769860f8345c3b73d4872d96079abf]
Would prune: server-2024-04-01 Mon, 2024-04-01 02:33:43 [f0b0c9ce782682b227cf1bc8543f2fabff36e197326be9d18b3eb4eae1a1bb0a]
Keeping archive (rule: monthly #2): server-2024-03-31 Sun, 2024-03-31 03:03:36 [f12a6fc422707ea535e3c688fde4de3e988fd51679cbcdac9fc3686b27f100b2]
Would prune: server-2024-03-30 Sat, 2024-03-30 02:34:19 [3a72272ca462dc1e8b691c0c918c89888cb335496579a6955eee1d280542bed6]
```

ThomasWaldmann commented 1 month ago

Please specify what you think is wrong in your 1.2.8 reproduction.

For me, it looks fine (and it gives the rules that kept the archives, so you can see why these were kept).

Craeckie commented 1 month ago

As I tried to explain in my first post, borg does not behave as described in the documentation, and additionally does not follow my intuition.

In the documentation it says

The rules are applied from secondly to yearly, and backups selected by previous rules do not count towards those of later rules.

But instead, rule monthly #1 is applied for server-2024-04-30 before all weekly rules have been applied. I expect borg to only apply keep-monthly after the last keep-weekly (here: rule: weekly #6).

Additionally, the behavior does not follow my intuition, because keep-weekly already ensures that backups exist for April 2024 (server-2024-04-27, server-2024-04-20, ...). Thus, keep-monthly does not need to be applied to April 2024. Borg, however, keeps server-2024-04-30. I would expect borg to apply keep-monthly only after keep-weekly, so server-2024-03-31 would be the first application of keep-monthly. If that were the case, borg would additionally keep the last backup of February 2024.

Therefore, the current behavior results in borg pruning all backups of February 2024 while keeping an unneeded backup (server-2024-04-30). This is contrary to both my intuition and my understanding of the documentation.

I hope this explanation clarifies the bug report.

ThomasWaldmann commented 1 month ago

Guess you misunderstood the docs.

The rules are applied in that order (you can check that in the code, the algorithm processes seconds first, ..., years last), but that does not imply that the resulting (sorted by date) output list is in that order. The order refers to what the code does, not to the backup dates.

You can easily verify that for that 1.2.8 output:

First it picks 6 weekly backups to keep:

Keeping archive (rule: weekly #1):       server-2024-05-15            Wed, 2024-05-15 02:34:33 [c6477fdf5306619308a0de891f6a0b8cfd7ba573fe85942142f2fbc7911a0d34]                                           
Keeping archive (rule: weekly #2):       server-2024-05-12            Sun, 2024-05-12 02:34:57 [2d80882acc7dc32ca8163126c7bd98d5194fdcfa7e1fffd9c8202e425976aed5]                                           
Keeping archive (rule: weekly #3):       server-2024-05-04            Sat, 2024-05-04 02:33:56 [fcbfc85be120e9c66ef65872d91cd6377b1c9e09d894269c46ea425c5bde72c7]                                           
Keeping archive (rule: weekly #4):       server-2024-04-27            Sat, 2024-04-27 02:33:53 [672eeffa5acd3983c6ef18bbf8a2d6882e1cae393501ca213fbec3d07678ba95]                                           
Keeping archive (rule: weekly #5):       server-2024-04-20            Sat, 2024-04-20 02:34:08 [3546afeb8ca9ad22090494c0d20c3812e0263ec7bb0c9a00fa0e400aaa9c69b6]                                           
Keeping archive (rule: weekly #6):       server-2024-04-14            Sun, 2024-04-14 02:33:47 [9b0380b5d5926c7dee47783dd0440ef9f6f36ba72232755aa0337b3c7f709229]

Then it picks 2 (additional) monthly backups to keep:

Keeping archive (rule: monthly #1):      server-2024-04-30            Tue, 2024-04-30 02:33:58 [2521252633dd13dbe4d32902b6b4f35769f863cd1c0153924253fd0393c02ab7]
Keeping archive (rule: monthly #2):      server-2024-03-31            Sun, 2024-03-31 03:03:36 [f12a6fc422707ea535e3c688fde4de3e988fd51679cbcdac9fc3686b27f100b2]

Note: 2024-05-15 would usually count as the monthly backup for May, because it is the last existing backup in that month. But as it was already picked as a weekly backup, borg picks the last backups of April and March to keep. So, you can see that it applied the weekly rule first and then the monthly rule.
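
The selection described above can be sketched in a few lines of Python. This is a simplified illustration of the behavior explained in this thread, not borg's actual code: the names `prune_keep` and `PERIOD_KEY` are hypothetical, only the weekly and monthly rules are modeled, time zones are ignored, and the archive list is assumed to start at 2024-03-29 because the posted output is truncated there.

```python
from datetime import date, timedelta

# Hypothetical period keys: archives with the same key fall into the same
# period, and only the newest archive of a period is a candidate for a rule.
PERIOD_KEY = {
    "weekly": lambda d: tuple(d.isocalendar())[:2],  # (ISO year, ISO week)
    "monthly": lambda d: (d.year, d.month),
}


def prune_keep(dates, rules):
    """Return {date: "rule #n"} for the archives that would be kept.

    rules is an ordered list such as [("weekly", 6), ("monthly", 2)].
    Every rule scans the complete archive list, newest first.
    """
    kept = {}
    newest_first = sorted(dates, reverse=True)
    for rule, n in rules:
        count = 0
        last_period = None
        for d in newest_first:
            if count == n:
                break
            period = PERIOD_KEY[rule](d)
            if period == last_period:
                continue  # not the newest archive of its period
            last_period = period
            if d in kept:
                continue  # already kept by an earlier rule: does not count here
            count += 1
            kept[d] = f"{rule} #{count}"
    return kept


# Archive dates taken from the dry-run output in this thread
# (days without an archive are listed explicitly).
missing = {date(2024, 4, 2), date(2024, 4, 17), date(2024, 4, 21),
           date(2024, 4, 28), date(2024, 5, 5), date(2024, 5, 6)}
start = date(2024, 3, 29)
dates = [start + timedelta(days=i) for i in range(48)]  # up to 2024-05-15
dates = [d for d in dates if d not in missing]

kept = prune_keep(dates, [("weekly", 6), ("monthly", 2)])
for d in sorted(kept, reverse=True):
    print(d, kept[d])
# 2024-05-15 weekly #1
# 2024-05-12 weekly #2
# 2024-05-04 weekly #3
# 2024-04-30 monthly #1
# 2024-04-27 weekly #4
# 2024-04-20 weekly #5
# 2024-04-14 weekly #6
# 2024-03-31 monthly #2
```

In this sketch the monthly pass starts again from the newest archive. May's candidate (2024-05-15) is skipped without being counted because the weekly pass already kept it, so the two monthly slots go to the last archives of April and March.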

Craeckie commented 1 month ago

Ah, thanks for the clarification!

I thought that each of the keep rules effectively extends the time range over which backups are kept, but instead these time ranges mostly overlap. Only if two rules would apply to the same backup is the later rule "postponed" to the next matching backup. I'm not sure how the documentation could be improved to clarify this for other people with the same misconception. If I have an idea, I will write it here.