laurent22 / rsync-time-backup

Time Machine style backup with rsync.

Expiration strategy not deleting oldest backup #175

Open yorkday opened 4 years ago

yorkday commented 4 years ago

I've recently adjusted my backup strategy to prune some older backups, but I noticed that for some reason it was not pruning the very first backup taken.

I am running the latest version (released 18 days ago).

See below: after reducing the strategy from 90:30 to 30:30 and adding a 365:0 strategy (delete backups older than 365 days), the backup from 2018-05-31-000059 was retained, while the others older than 365 days were removed.

Old Strategy: "1:1 90:30"

Directories:

drwx------ 78 admin users 16384 Oct 17 00:03 .
drwx------  6 admin users  4096 Dec 13  2018 ..
d---------  4 root  root   4096 May  7  2018 2018-05-31-000059
d---------  4 root  root   4096 May  7  2018 2018-06-30-000058
d---------  4 root  root   4096 May  7  2018 2018-07-30-000038
d---------  4 root  root   4096 May  7  2018 2018-08-29-000036
d---------  4 root  root   4096 Sep 28  2018 2018-09-29-000056
d---------  4 root  root   4096 Sep 28  2018 2018-10-31-000049
d---------  4 root  root   4096 Sep 28  2018 2018-12-01-000101
d---------  4 root  root   4096 Sep 28  2018 2019-01-05-000130
d---------  4 root  root   4096 Sep 28  2018 2019-02-04-000105
d---------  4 root  root   4096 Sep 28  2018 2019-03-06-000108
d---------  4 root  root   4096 Sep 28  2018 2019-04-06-000117
d---------  4 root  root   4096 Sep 28  2018 2019-05-06-000117
d---------  4 root  root   4096 Sep 28  2018 2019-06-06-000101
d---------  4 root  root   4096 Sep 28  2018 2019-07-07-000104
d---------  4 root  root   4096 Sep 28  2018 2019-07-19-000118
d---------  4 root  root   4096 Sep 28  2018 2019-07-21-000101
d---------  4 root  root   4096 Sep 28  2018 2019-07-22-000147
d---------  4 root  root   4096 Sep 28  2018 2019-07-24-000126
d---------  4 root  root   4096 Sep 28  2018 2019-07-25-000126
d---------  4 root  root   4096 Sep 28  2018 2019-07-27-000126
d---------  4 root  root   4096 Sep 28  2018 2019-07-28-000128
d---------  4 root  root   4096 Sep 28  2018 2019-07-29-000130
d---------  4 root  root   4096 Sep 28  2018 2019-07-30-000145
d---------  4 root  root   4096 Sep 28  2018 2019-08-01-000128
d---------  4 root  root   4096 Sep 28  2018 2019-08-02-000135
d---------  4 root  root   4096 Sep 28  2018 2019-08-03-000142
d---------  4 root  root   4096 Sep 28  2018 2019-08-05-000126
d---------  4 root  root   4096 Sep 28  2018 2019-08-06-000141
d---------  4 root  root   4096 Sep 28  2018 2019-08-08-000128
d---------  4 root  root   4096 Sep 28  2018 2019-08-09-000151
d---------  4 root  root   4096 Sep 28  2018 2019-08-11-000126
d---------  4 root  root   4096 Sep 28  2018 2019-08-12-000136
d---------  4 root  root   4096 Sep 28  2018 2019-08-14-000127
d---------  4 root  root   4096 Sep 28  2018 2019-08-16-000138
d---------  4 root  root   4096 Sep 28  2018 2019-08-18-000131
d---------  4 root  root   4096 Sep 28  2018 2019-08-20-000126
d---------  4 root  root   4096 Sep 28  2018 2019-08-22-000129
d---------  4 root  root   4096 Sep 28  2018 2019-08-24-000130
d---------  4 root  root   4096 Sep 28  2018 2019-08-26-000132
d---------  4 root  root   4096 Sep 28  2018 2019-08-28-000132
d---------  4 root  root   4096 Sep 28  2018 2019-08-29-000132
d---------  4 root  root   4096 Sep 28  2018 2019-08-30-000157
d---------  4 root  root   4096 Sep 28  2018 2019-09-01-000129
d---------  4 root  root   4096 Sep 28  2018 2019-09-02-000130
d---------  4 root  root   4096 Sep 28  2018 2019-09-03-000150
d---------  4 root  root   4096 Sep 28  2018 2019-09-04-000242
d---------  4 root  root   4096 Sep 28  2018 2019-09-06-000158
d---------  4 root  root   4096 Sep 28  2018 2019-09-07-014735
d---------  4 root  root   4096 Sep 28  2018 2019-09-09-000137
d---------  4 root  root   4096 Sep 28  2018 2019-09-10-000142
d---------  4 root  root   4096 Sep 28  2018 2019-09-11-000202
d---------  4 root  root   4096 Sep 28  2018 2019-09-12-000332
d---------  4 root  root   4096 Sep 28  2018 2019-09-14-000145
d---------  4 root  root   4096 Sep 28  2018 2019-09-16-000144
d---------  4 root  root   4096 Sep 28  2018 2019-09-17-000214
d---------  4 root  root   4096 Sep 28  2018 2019-09-19-000148
d---------  4 root  root   4096 Sep 28  2018 2019-09-20-000237
d---------  4 root  root   4096 Sep 28  2018 2019-09-22-000155
d---------  4 root  root   4096 Sep 28  2018 2019-09-23-000230
d---------  4 root  root   4096 Sep 28  2018 2019-09-25-000147
d---------  4 root  root   4096 Sep 28  2018 2019-09-26-000217
d---------  4 root  root   4096 Sep 28  2018 2019-09-27-000224
d---------  4 root  root   4096 Sep 28  2018 2019-09-29-000132
d---------  4 root  root   4096 Sep 28  2018 2019-09-30-000232
d---------  4 root  root   4096 Sep 28  2018 2019-10-02-000151
d---------  4 root  root   4096 Sep 28  2018 2019-10-04-000158
d---------  5 root  root   4096 Oct  4 15:28 2019-10-06-000134
d---------  5 root  root   4096 Oct  4 15:28 2019-10-08-000257
d---------  5 root  root   4096 Oct  4 15:28 2019-10-10-000217
d---------  5 root  root   4096 Oct  4 15:28 2019-10-11-000218
d---------  5 root  root   4096 Oct  4 15:28 2019-10-12-000356
d---------  5 root  root   4096 Oct  4 15:28 2019-10-14-000226
d---------  5 root  root   4096 Oct  4 15:28 2019-10-15-000806
d---------  5 root  root   4096 Oct  4 15:28 2019-10-16-000253
d---------  5 root  root   4096 Oct  4 15:28 2019-10-17-000224

New Strategy: "1:1 30:30 365:0"

Directories:

drwx------ 36 admin users 16384 Oct 17 20:20 .
drwx------  6 admin users  4096 Dec 13  2018 ..
d---------  4 root  root   4096 May  7  2018 2018-05-31-000059
d---------  4 root  root   4096 Sep 28  2018 2018-10-31-000049
d---------  4 root  root   4096 Sep 28  2018 2018-12-01-000101
d---------  4 root  root   4096 Sep 28  2018 2019-01-05-000130
d---------  4 root  root   4096 Sep 28  2018 2019-02-04-000105
d---------  4 root  root   4096 Sep 28  2018 2019-03-06-000108
d---------  4 root  root   4096 Sep 28  2018 2019-04-06-000117
d---------  4 root  root   4096 Sep 28  2018 2019-05-06-000117
d---------  4 root  root   4096 Sep 28  2018 2019-06-06-000101
d---------  4 root  root   4096 Sep 28  2018 2019-07-07-000104
d---------  4 root  root   4096 Sep 28  2018 2019-08-06-000141
d---------  4 root  root   4096 Sep 28  2018 2019-09-06-000158
d---------  4 root  root   4096 Sep 28  2018 2019-09-19-000148
d---------  4 root  root   4096 Sep 28  2018 2019-09-20-000237
d---------  4 root  root   4096 Sep 28  2018 2019-09-22-000155
d---------  4 root  root   4096 Sep 28  2018 2019-09-23-000230
d---------  4 root  root   4096 Sep 28  2018 2019-09-25-000147
d---------  4 root  root   4096 Sep 28  2018 2019-09-26-000217
d---------  4 root  root   4096 Sep 28  2018 2019-09-27-000224
d---------  4 root  root   4096 Sep 28  2018 2019-09-29-000132
d---------  4 root  root   4096 Sep 28  2018 2019-09-30-000232
d---------  4 root  root   4096 Sep 28  2018 2019-10-02-000151
d---------  4 root  root   4096 Sep 28  2018 2019-10-04-000158
d---------  5 root  root   4096 Oct  4 15:28 2019-10-06-000134
d---------  5 root  root   4096 Oct  4 15:28 2019-10-08-000257
d---------  5 root  root   4096 Oct  4 15:28 2019-10-10-000217
d---------  5 root  root   4096 Oct  4 15:28 2019-10-11-000218
d---------  5 root  root   4096 Oct  4 15:28 2019-10-12-000356
d---------  5 root  root   4096 Oct  4 15:28 2019-10-14-000226
d---------  5 root  root   4096 Oct  4 15:28 2019-10-15-000806
d---------  5 root  root   4096 Oct  4 15:28 2019-10-16-000253
d---------  5 root  root   4096 Oct  4 15:28 2019-10-17-000224
d---------  5 root  root   4096 Oct  4 15:28 2019-10-17-201948

kapitainsky commented 4 years ago

It is a deliberate decision in the pruning logic to always retain the oldest (first) backup. The reasoning behind it was that many users back up their system and, feeling that all their data is safe, "free some disk space". They then discover that their backup has been pruned in such a way that the data is gone.

https://github.com/laurent22/rsync-time-backup/blob/da904fe66ce384ff3f844fdfc81b6a4d95410d9a/rsync_tmbackup.sh#L121-L126

It is safe to simply rm -rf the oldest backup if you don't need it.
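
For readers following along, the behaviour described above can be sketched roughly as below. This is a simplified Python model, not the script's actual code (the real logic is the Bash at the link above). The names `plan_prune` and `keep_oldest` are hypothetical, and the rule interpretation ("X:Y" meaning: once a backup is at least X days old, keep one every Y days, with Y = 0 meaning delete) is an assumption based on how the strategy is described in this thread.

```python
from datetime import datetime


def parse_strategy(text):
    """Parse "X:Y" tokens into (X, Y) integer pairs, largest X first."""
    pairs = [tuple(int(part) for part in token.split(":")) for token in text.split()]
    return sorted(pairs, reverse=True)


def plan_prune(names, strategy, now, keep_oldest=True):
    """Return the backup names this simplified model would delete.

    Assumptions (not confirmed by the thread): the newest backup is
    never pruned, and intervals are compared in whole days.
    """
    rules = parse_strategy(strategy)
    backups = sorted(names)
    to_delete = []
    last_kept_day = None
    for index, name in enumerate(backups):
        if index == len(backups) - 1:
            break  # never touch the newest backup
        stamp = datetime.strptime(name, "%Y-%m-%d-%H%M%S")
        day = stamp.toordinal()
        if index == 0 and keep_oldest:
            last_kept_day = day  # the behaviour this issue is about
            continue
        age_seconds = (now - stamp).total_seconds()
        for x_days, y_days in rules:
            if age_seconds >= x_days * 86400:
                too_close = last_kept_day is not None and day - last_kept_day < y_days
                if y_days == 0 or too_close:
                    to_delete.append(name)
                else:
                    last_kept_day = day
                break  # only the first matching rule applies
    return to_delete


names = [
    "2018-05-31-000059",
    "2018-06-30-000058",
    "2018-10-31-000049",
    "2019-10-17-000224",
]
now = datetime(2019, 10, 17, 20, 20)
print(plan_prune(names, "1:1 30:30 365:0", now))
# ['2018-06-30-000058']
print(plan_prune(names, "1:1 30:30 365:0", now, keep_oldest=False))
# ['2018-05-31-000059', '2018-06-30-000058']
```

With `keep_oldest=True` the model reproduces what was reported: the 2018-05-31 backup survives the 365:0 rule while the next one older than 365 days is removed.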

yorkday commented 4 years ago

I don't understand this reasoning. There is nothing special about the first backup; it is just a backup point. You can prune any backup, and all remaining backup points are still valid due to hard links. Retaining the first backup means the script does not comply with the backup strategy specified.

kapitainsky commented 4 years ago

I fixed the pruning logic and introduced this. I know it can be done differently, but I preferred this way. In a perfect world it would be configurable, but I have no time at the moment to modify it.

yorkday commented 4 years ago

I appreciate you fixing the pruning logic, which was not working, but this feature that works for you means the backup strategy now does not work as specified. I think it should be commented out and, if it is wanted, added as a separate flag. At a minimum, changes in functionality should also be noted in the readme (i.e. the strategy will now always keep the first backup).

I also think you may have misunderstood the value of a "first" backup. Due to hard links, there is nothing special about it. You can prune any backup to your heart's content, and hard linking will always ensure every backup point is still valid. The only loss in pruning a backup is the inability to recover to that point; it doesn't break any other backup.
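
The hard link point can be checked with a small experiment. The sketch below is only an illustration in Python, assuming a filesystem that supports hard links; the file name `notes.txt` and the function name are made up, and the two directory names are borrowed from the listing above. It links one file into two "backup" directories, deletes the older directory, and reads the file from the newer one.

```python
import os
import shutil
import tempfile


def demo_hard_link_pruning():
    """Delete the older of two hard-linked snapshots and read the newer one."""
    root = tempfile.mkdtemp()
    try:
        first = os.path.join(root, "2018-05-31-000059")
        second = os.path.join(root, "2018-06-30-000058")
        os.mkdir(first)
        os.mkdir(second)

        original = os.path.join(first, "notes.txt")
        with open(original, "w") as handle:
            handle.write("unchanged between backups")

        # An unchanged file in the newer snapshot is a second name for the same data.
        linked = os.path.join(second, "notes.txt")
        os.link(original, linked)
        links_before = os.stat(linked).st_nlink

        # "Prune" the first backup entirely.
        shutil.rmtree(first)

        with open(linked) as handle:
            content = handle.read()
        return links_before, os.stat(linked).st_nlink, content
    finally:
        shutil.rmtree(root, ignore_errors=True)


print(demo_hard_link_pruning())
# (2, 1, 'unchanged between backups')
```

The link count drops from 2 to 1, but the data stays readable through the remaining name, which is why removing any single backup directory leaves the others intact.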

kapitainsky commented 4 years ago

Of course it has nothing to do with preserving hard links - this is clear.

It is more of a "foolproof" safety measure. In my business environment I have often seen users back up their system (not necessarily using this script) and, as soon as that was done, delete stuff from it because everything was now "safe" in the backup. Some time later, when a restore was needed, it often came as a big surprise that their data was gone due to retention policies. So now when I configure users' backups I always keep the oldest one (unless there is no space on the backup medium).

This logic was tested for months and months by various people (#121) and users were ok with it.

You are right that for power users it is not needed, and I agree that ideally it should be parameterized, but I did not have time to make it perfect. It solved a much bigger issue that was present before, and that was the main focus.

And as always there's more than one way to skin a cat.

yorkday commented 4 years ago

By keeping the first backup to save users and make it "fool proof", I think you're perpetuating the perception that the backup is an archive mechanism, when it's not.

Keeping the oldest backup will only save you if the file you deleted and want back is available at that recovery point. Any files created after that point and then deleted are not recoverable unless the granularity of your recovery points is sufficient and you notice that the file needs to be recovered before the recovery point is pruned out.
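
To make that concrete, here is a tiny model in Python. It is only a sketch: days are plain integers, and `recoverable` is a hypothetical helper, not anything from the script. A file can be restored only if at least one kept backup was taken while the file existed.

```python
def recoverable(created_day, deleted_day, kept_backup_days):
    """True if some kept backup was taken while the file existed."""
    return any(created_day <= day < deleted_day for day in kept_backup_days)


# File created on day 10 and deleted on day 20; only the oldest (day 0)
# and a later (day 40) backup survive pruning.
print(recoverable(10, 20, [0, 40]))      # False
# Same file, but a backup from day 15 is still kept.
print(recoverable(10, 20, [0, 15, 40]))  # True
# A file that already existed when the oldest backup was taken.
print(recoverable(-5, 20, [0, 40]))      # True
```

Only the third case is helped by always keeping the oldest backup.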

I understand what you want to achieve, and I don't agree that your solution actually solves the problem you're trying to solve. But my biggest issue is that, in fixing a specific bug, you modified a separate specification (the backup strategy) to suit your needs, so that it does not operate as the user specifies.

kapitainsky commented 4 years ago

I take your point and agree in principle, but given that the retention policy previously was not doing what the user specified at all, I still consider it a step in the right direction.

It is definitely not the final version, and it will be modified by somebody to make it better.

SimonHeimberg commented 4 years ago

Current workaround: create an empty directory with a name much older than your oldest backup (like '1900-01-01-000000').
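
A minimal sketch of that workaround in Python, assuming the backup destination is a local directory you can write to. The helper name `add_placeholder_backup` is hypothetical, and whether the script treats an empty placeholder as the "oldest backup" rests only on the reports in this thread.

```python
import os

# Directory name taken from the comment above; any valid timestamp
# older than the real oldest backup should sort first.
PLACEHOLDER = "1900-01-01-000000"


def add_placeholder_backup(dest_dir):
    """Create an empty, very old-looking backup directory in dest_dir."""
    path = os.path.join(dest_dir, PLACEHOLDER)
    os.makedirs(path, exist_ok=True)
    return path
```

For example, `add_placeholder_backup("/path/to/backups")` (the path is a placeholder for your own destination) creates the directory once and is harmless to run again.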

yorkday commented 4 years ago

Just confirming this is still an open issue. The workaround suggested above by @SimonHeimberg does work, but the code should still be fixed to prune as per the command (it should not assume the oldest backup should be kept at all times).