laurent22 / rsync-time-backup

Time Machine style backup with rsync.

Expiration strategy deleting backups that should be kept #121

Closed yorkday closed 4 years ago

yorkday commented 6 years ago

Hi there,

I raised an issue with the expiration strategy earlier, but it was closed when the branch was merged to master: https://github.com/laurent22/rsync-time-backup/issues/105

Because the expiration strategy uses an absolute timestamp value to calculate the age between backups, it deletes backups that don't match the exact durations required.

The two examples below show different ways rsync_time_backup removes backups that would be expected to be kept.

I do not like the current expiration strategy because it uses an absolute duration measured in timestamp seconds. If there is a backup at 11pm one day and another at 2am the next day, and the user wishes to keep daily backups, both should be retained as daily backup points if they are the only backups taken on their respective days.

Preferred solution: rsync_time_backup should retain backups for days, weeks, months and years based on calendar periods, not on the number of seconds from the current backup point. I prefer the method used by restic (http://restic.readthedocs.io/en/latest/060_forget.html), which lets users specify the number of daily, weekly, monthly or yearly backups to retain. This way users can keep the last backup of a given month, rather than relying on a fixed window like 30 days, which could hold two backups for one month (say the 1st and the 31st) or none for another (e.g. February, which has 28/29 days).
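To illustrate the calendar-bucket idea (a minimal sketch of my own, not code from rsync_time_backup or restic): backups can be grouped by the YYYY-MM prefix of their names and the newest one per bucket kept, regardless of how many seconds separate them.

```shell
#!/bin/sh
# Minimal sketch of calendar-based monthly retention (hypothetical,
# not the script's current logic). Backup names use rsync_tmbackup's
# YYYY-MM-DD-HHMMSS convention; we bucket by the YYYY-MM prefix and
# keep the newest entry in each bucket.
keep_monthly() {
    printf '%s\n' "$1" | sort | \
        awk -F- '{ latest[$1 "-" $2] = $0 } END { for (m in latest) print latest[m] }' | \
        sort
}

backups="2018-01-31-000009
2018-02-01-000009
2018-02-28-000009
2018-03-31-000009"

keep_monthly "$backups"
```

With this bucketing, 2018-01-31 and 2018-02-28 each survive as the monthly backup for January and February, which is what the fixed 30-day window fails to guarantee.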

Example 1 - Daily backups removed: Command used: rsync_tmbackup.sh ./source/ ./target/

Before:

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-13-000010
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-14-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-15-000008
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-16-000007
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-17-000006
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-18-000005
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-190009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 19:47 latest -> 2018-05-19-194725

After: Note: backups from the 17th, 15th and 13th were deleted because they were not exactly 24 hours (measured in seconds) after the previously kept backup.

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-14-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-16-000007
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-18-000005
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-190009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:03 latest -> 2018-05-19-200316

Example 2 - Monthly backups removed with 30 day strategy: Command Used: rsync_tmbackup.sh --strategy "1:1 30:30" ./source/ ./target/

Before:

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-01-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-02-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-02-28-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-30-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:03 latest -> 2018-05-19-200316

After: Note: the backups from January and February were removed altogether, despite being the only backup points for those months.

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-30-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-202723
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:27 latest -> 2018-05-19-202723
yorkday commented 6 years ago

Unfortunately I think there is a serious bug with the way backup expiration currently works.

The test cases presented are flawed because they assume a large existing history of backup points, and the time the test is run differs from each backup's timestamp.

In my experience running the tool so far, I keep finding gaps in daily backups wherever there are not exactly 86,400 seconds between backups. Worse, I am finding that zero weekly backups are kept after 30 days.

For example, after running daily backups since 2018-04-17 with the default strategy (1:1 30:7 365:30), here is what I have across several backup targets:

Example 1:

d--------- 25 root root 4096 May 22 21:53 2018-06-02-000012
d--------- 25 root root 4096 May 22 21:53 2018-06-04-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-06-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-07-000019
d--------- 25 root root 4096 May 22 21:53 2018-06-09-000016
d--------- 25 root root 4096 May 22 21:53 2018-06-11-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-13-000018
d--------- 25 root root 4096 May 22 21:53 2018-06-15-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-17-000018
d--------- 25 root root 4096 May 22 21:53 2018-06-18-000020
d--------- 25 root root 4096 May 22 21:53 2018-06-20-000014
d--------- 25 root root 4096 May 22 21:53 2018-06-21-000021
d--------- 25 root root 4096 May 22 21:53 2018-06-23-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-24-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-25-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-27-000012
d--------- 25 root root 4096 May 22 21:53 2018-06-29-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-30-000018
drwxr-xr-x  2 root root 4096 Jun 30 00:00 backup_log
-rw-r--r--  1 root root    0 Apr 17 11:06 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:00 latest -> 2018-06-30-000018

Example 2:

d--------- 31 root root 4096 May 24 00:28 2018-06-02-000129
d--------- 31 root root 4096 May 24 00:28 2018-06-04-000155
d--------- 31 root root 4096 May 24 00:28 2018-06-06-000122
d--------- 31 root root 4096 May 24 00:28 2018-06-08-000141
d--------- 31 root root 4096 May 24 00:28 2018-06-09-000209
d--------- 31 root root 4096 May 24 00:28 2018-06-11-000120
d--------- 31 root root 4096 May 24 00:28 2018-06-13-000155
d--------- 31 root root 4096 May 24 00:28 2018-06-15-000138
d--------- 31 root root 4096 May 24 00:28 2018-06-17-000145
d--------- 31 root root 4096 May 24 00:28 2018-06-18-003218
d--------- 31 root root 4096 May 24 00:28 2018-06-19-010608
d--------- 31 root root 4096 May 24 00:28 2018-06-20-010710
d--------- 31 root root 4096 May 24 00:28 2018-06-22-000118
d--------- 31 root root 4096 May 24 00:28 2018-06-23-000126
d--------- 31 root root 4096 May 24 00:28 2018-06-25-000112
d--------- 31 root root 4096 May 24 00:28 2018-06-27-000104
d--------- 31 root root 4096 May 24 00:28 2018-06-29-000112
d--------- 31 root root 4096 May 24 00:28 2018-06-30-000140
drwxr-xr-x  2 root root 4096 Jun 30 00:02 backup_log
-rw-r--r--  1 root root    0 Apr 17 11:07 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:02 latest -> 2018-06-30-000140

Example 3:

d--------- 20 root root 4096 Oct 16  2017 2018-06-02-000002
d--------- 20 root root 4096 Oct 16  2017 2018-06-03-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-04-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-05-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-06-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-07-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-08-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-09-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-10-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-11-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-12-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-13-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-14-000012
d--------- 20 root root 4096 Oct 16  2017 2018-06-16-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-18-000002
d--------- 20 root root 4096 Oct 16  2017 2018-06-19-000005
d--------- 20 root root 4096 Oct 16  2017 2018-06-21-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-22-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-23-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-25-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-27-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-28-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-29-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-30-000002
drwxr-xr-x  2 root root 4096 Jun 30 00:00 backup_log
-rw-r--r--  1 root root    0 Apr 17 10:26 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:00 latest -> 2018-06-30-000002

Note the issues:

  1. Gaps in daily backups
  2. Not a single weekly backup kept after 30 days

I know there are gaps, because my logs show that backups have run every single day - at varying levels of minutes/seconds past midnight.

This is a serious fault introduced by the recent code allowing strategy customisation: it requires exactly 86,400 seconds between daily backups, and the strategy's test cases assume a pre-existing history of backups (whereas normally backups are added one per day, and each prune compares against the time it is run).

I think the only workaround at this time is to retain all daily backups (i.e. --strategy 1:1) and prune manually until this is resolved.

I would love to help further, but I am really not familiar with scripting - nor do I know how to write code that would remain correct across all the platforms that this script supports.

laurent22 commented 6 years ago

I see it's indeed an issue for the daily backups. Making the interval a bit less strict, like 85,000 seconds, would handle this better, but I'm not sure that's a good solution.

Do you have any idea though why there's no backup after 30 days? Any way to replicate this bug?

yorkday commented 6 years ago

Hi Laurent,

I have replicated the bug through some hacking at the code.

Environment: macOS High Sierra 10.13.5

Steps to replicate bug in new backup:

  1. Download latest version of rsync_tmbackup
  2. Create empty source and target folders
  3. Touch backup.marker in target folder
  4. Modify rsync_tmbackup.sh
    • set NOW and EPOCH variables manually to replicate as-was backup timing
    • place the NOW and EPOCH variables on lines 4 and 5 of the script and comment out the originals lower down
    • NOW=$(date +"%Y-06-10-000001")
    • EPOCH=$(date -j -f "%d-%B-%y-%T" 10-JUN-18-00:00:01 "+%s")
  5. Run script with following parameters:
    • Strategy is keep daily after 1 day, keep weekly after 8 days
    • ./rsync_tmbackup.sh --strategy "1:1 8:7" --log-dir /Users/yday/Downloads/rsync-time-backup-master ./source ./target
  6. After each run of script, increment the dates for NOW and EPOCH by 1 day
  7. After 8 runs, the target directory looks like this:
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-10-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-11-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-12-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-13-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-14-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-15-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-16-000001
    drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-17-000001
    -rw-r--r--   1 yday  staff     0  2 Jul 21:24 backup.marker
    lrwxr-xr-x   1 yday  staff    17  2 Jul 21:27 latest -> 2018-06-17-000001
  8. After the 9th run, the first backup is pruned, rather than being kept as the first weekly backup
    rsync_tmbackup: Creating destination ./target/2018-06-18-000001
    rsync_tmbackup: Expiring ./target//2018-06-10-000001
    rsync_tmbackup: Starting backup...
    • Target directory looks like:
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-11-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-12-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-13-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-14-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-15-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-16-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-17-000001
      drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-18-000001
      -rw-r--r--   1 yday  staff     0  2 Jul 21:24 backup.marker
      lrwxr-xr-x   1 yday  staff    17  2 Jul 21:28 latest -> 2018-06-18-000001
  9. Now every subsequent run prunes the oldest daily backup, not keeping any weekly backups
  10. After 10 more runs, target directory looks like:
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-23-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-24-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-25-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-26-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-27-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-28-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-29-000001
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-30-000001
    -rw-r--r--    1 yday  staff     0  2 Jul 21:24 backup.marker
    lrwxr-xr-x    1 yday  staff    17  2 Jul 21:31 latest -> 2018-06-30-000001
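The manual date-bumping in steps 4 to 6 could be scripted. A hypothetical harness (the fmt_epoch helper and the epoch constants are mine, not part of rsync_tmbackup) that emits one NOW/EPOCH pair per simulated day might look like:

```shell
#!/bin/sh
# Hypothetical harness for the reproduction steps above: emit one
# simulated (NOW, EPOCH) pair per day. fmt_epoch tries the BSD/macOS
# date flag first and falls back to the GNU coreutils one.
fmt_epoch() {
    date -u -r "$1" +"%Y-%m-%d-%H%M%S" 2>/dev/null \
        || date -u -d "@$1" +"%Y-%m-%d-%H%M%S"
}

start_epoch=1528588801   # 2018-06-10 00:00:01 UTC
day=0
while [ "$day" -le 17 ]; do
    epoch=$((start_epoch + day * 86400))
    now=$(fmt_epoch "$epoch")
    echo "$now $epoch"
    # A patched rsync_tmbackup.sh that reads NOW/EPOCH from the
    # environment could then be invoked once per simulated day, e.g.:
    # NOW="$now" EPOCH="$epoch" ./rsync_tmbackup.sh --strategy "1:1 8:7" ./source ./target
    day=$((day + 1))
done
```

This avoids hand-editing the script between runs, though it still assumes the script has been patched to honour externally supplied NOW/EPOCH values as described in step 4.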

Bug does not occur for existing backups with history: However, if I pre-populate the target directory with a large history of existing directories, it correctly prunes dailies and weeklies as expected:

  1. Target directory:
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-17-102937
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-17-170951
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-18-183109
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-20-220247
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-170314
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-222356
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-223504
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-22-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-23-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-24-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-25-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-26-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-27-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-28-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-29-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-30-000012
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-01-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-02-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-03-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-04-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-05-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-06-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-07-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-08-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-09-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-10-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-11-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-12-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-13-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-14-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-15-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-16-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-17-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-18-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-19-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-20-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-21-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-22-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-22-222223
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-23-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-24-000008
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-25-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-26-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-27-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-28-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-29-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-30-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-31-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-01-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-02-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-03-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-04-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-05-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-06-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-07-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-08-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-09-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-10-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-11-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-12-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-13-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-14-000015
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-15-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-16-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-17-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-18-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-19-000010
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-20-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-21-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-22-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-23-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-24-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-25-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-26-000008
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-27-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-28-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-29-000006
    -rw-r--r--    1 yday  staff     0  2 Jul 21:32 backup.marker
  2. After one run:
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-18-183109
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-26-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-03-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-10-000005
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-18-000003
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-25-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-01-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-08-000006
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-16-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-23-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-24-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-25-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-27-000004
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-28-000007
    drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-29-000006
    drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-30-000001
    -rw-r--r--    1 yday  staff     0  2 Jul 21:32 backup.marker
    lrwxr-xr-x    1 yday  staff    17  2 Jul 21:33 latest -> 2018-06-30-000001
  3. Note the 26/06 daily is still incorrectly pruned (however, this is the absolute-seconds issue described earlier).

Summary:

  1. Pruning strategy is not working for new backups with no history
  2. Pruning strategy works when there is existing backup history

It would appear the cause for new backups is that, because there are no historical backups old enough to meet the current strategy, the first backup to roll off the previous strategy is pruned when it should be kept. In my example, on the 9th run the oldest backup is pruned because it is not 7 x 86,400 seconds older than the next backup. This is why no backups are kept after 8 days (or 30 days with the default strategy): no new daily backup will ever meet the age rule.

This bug was not picked up because:

  1. The test scripts assumed an existing history of directories
  2. The test scripts did not set the NOW and EPOCH times as-was to verify how the backup would behave over time (they only ran at the current date/time)

In the above example, the 2018-06-10 backup should have been kept as the first weekly backup, and the backups between 2018-06-11 and 2018-06-16 should have been pruned.

One possible solution to daily backups being wrongly pruned is not to measure the distance between backups in absolute seconds, but to measure the difference in days between individual backups.

For example, a backup at 2018-06-10-235959 and one at 2018-06-11-000000 should each be kept as daily backups, even though they are one second apart (currently the script would prune the first, even though it was the only backup taken that day). What matters is not that they are 86,400 seconds apart (distance between backups) but that they fall on different days (absolute date of backups), making both valid to retain under a daily backup strategy.
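The distinction can be shown with two hypothetical epoch timestamps one second apart. Integer-dividing epoch seconds by 86,400 yields a day number, so the two backups land on different days even though almost no time separates them (note this draws day boundaries at UTC midnight; local-midnight boundaries would need timezone handling):

```shell
#!/bin/sh
# Two backups one second apart, straddling midnight (UTC):
ts_a=1528675199   # 2018-06-10 23:59:59 UTC
ts_b=1528675200   # 2018-06-11 00:00:00 UTC

# Elapsed-seconds comparison: far less than 86,400, so the current
# strategy would prune one of them.
echo $((ts_b - ts_a))                     # 1 second apart

# Day-number comparison: they fall on different days, so both
# qualify as daily backups.
echo $(( ts_b / 86400 - ts_a / 86400 ))   # 1 day apart
```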

I would also like to see the strategy adjusted to use human-readable time periods (e.g. weeks, months, years) rather than absolute days. People generally do not want to keep backups every 30 days; they want monthly backups, which have to deal with months of 28, 29, 30 or 31 days. Unfortunately this would mean a complete rewrite of the pruning strategy, which is also why I suggested the approach used by restic: http://restic.readthedocs.io/en/latest/060_forget.html

I hope this helps.

Albert444 commented 5 years ago

Hi,

first of all, a big thank you to laurent22 for writing this very nice and handy script! It is most useful for all kinds of regular, reliable backups.

Although I love the script, I ran into this bug as well in daily use (note, it does not occur with your test script, for the reasons discussed above). I am not a pro at scripting and could not do it as elegantly as it should be, but I think I found a solution, altering the script in two places, that seems to work for me.

First, I changed linux*) date -d "${1:0:10} ${1:11:2}:${1:13:2}:${1:15:2}" +%s ;; to linux*) date -d "${1:0:10}" +%s ;; in order to artificially set each backup's time to midnight, so that slightly different backup times per day (due to differing amounts of backup work when the script is run several times for different directories) do not matter.

Second, I changed the contents of the fn_expire_backups function, using a modulo operation:

fn_expire_backups() {
    local current_timestamp=$EPOCH
    local start_timestamp=$(fn_parse_date "2018-10-10-000000")  # fixed reference date anchoring the modulo buckets
    local last_kept_timestamp=9999999999

    # Process each backup dir from most recent to oldest
    for backup_dir in $(fn_find_backups | sort -r); do
        local backup_date=$(basename "$backup_dir")
        local backup_timestamp=$(fn_parse_date "$backup_date")

        # Skip if failed to parse date...
        if [ -z "$backup_timestamp" ]; then
            fn_log_warn "Could not parse date: $backup_dir"
            continue
        fi

        # Find which strategy token applies to this particular backup
        for strategy_token in $(echo $EXPIRATION_STRATEGY | tr " " "\n" | sort -r -n); do
            IFS=':' read -r -a t <<< "$strategy_token"

            # After which date (relative to today) this token applies (X)
            local cut_off_timestamp=$((current_timestamp - ${t[0]} * 86400))

            # Every how many days should a backup be kept past the cut off date (Y)
            local cut_off_interval=$((${t[1]} * 86400))

            # If we've found the strategy token that applies to this backup
            if [ "$backup_timestamp" -le "$cut_off_timestamp" ]; then

                # Special case: if Y is "0" we delete every time
                if [ $cut_off_interval -eq "0" ]; then
                    fn_expire_backup "$backup_dir"
                    break
                fi

                # Check if the current backup is in the interval between
                # the last backup that was kept and Y
                local interval_since_last_kept=$((last_kept_timestamp - backup_timestamp))

                dayssincestart=$(((backup_timestamp-start_timestamp)/86400))
                modulo=$((dayssincestart % (cut_off_interval/86400)))

                if [ "$modulo" -ne "0" ]; then
                    # Yes: Delete that one
                    fn_expire_backup "$backup_dir"
                else
                    if [ "$interval_since_last_kept" -lt "86400" ]; then # if the difference is less than one day
                        fn_expire_backup "$backup_dir"
                    else
                        # No: Keep it
                        last_kept_timestamp=$backup_timestamp
                    fi
                fi
                break
            fi
        done
    done
}
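One subtlety of the function above: because buckets are anchored to the hardcoded start_timestamp rather than to the previous kept backup, a backup survives only when its whole-day offset from that date is a multiple of the interval. A standalone illustration (hypothetical epoch values, UTC day arithmetic):

```shell
#!/bin/sh
# Illustration of the modulo bucketing in fn_expire_backups above:
# with a 7-day interval, a backup is kept only when its whole-day
# offset from start_timestamp is an exact multiple of 7.
start_timestamp=1539129600   # 2018-10-10 00:00:00 UTC
in_bucket() {
    days=$(( ($1 - start_timestamp) / 86400 ))
    [ $(( days % 7 )) -eq 0 ]
}

day14=$((start_timestamp + 14 * 86400))   # multiple of 7: kept
day15=$((start_timestamp + 15 * 86400))   # not a multiple: expired
in_bucket "$day14" && echo "day 14: kept"
in_bucket "$day15" || echo "day 15: expired"
```

One thing worth checking: backups taken before the hardcoded 2018-10-10 reference date yield a negative day offset, whose modulo behaviour depends on the shell's arithmetic semantics.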

It is just a suggestion; maybe laurent22 or anybody else can check it and build it into the script in a more elegant way.

I hope this helps somebody...

hazelra commented 5 years ago

I appreciate all the work done here! I think this is a great way to make backups. I've had the same issue with the expiration strategy and pruning not quite working as expected. Has the issue been corrected in the latest code from 7 months ago?

yorkday commented 5 years ago

Hi @hazelra, unfortunately I don't think the issue has been corrected since it was raised. I have lived with it by choosing to keep all daily backups forever. Even so, not all daily backups are kept, due to the absolute duration in seconds required between two backups run on separate days.

From a backup perspective, I continue to use this tool for its unencrypted, local backup capabilities. Restoration is as simple as browsing to the directory and copying files back if I need anything. For more robust backups, I choose http://restic.net because it is under constant development and improvement and has great encryption and cloud backup capabilities. Restic has much more capable and flexible pruning, but it enforces encryption, which isn't what I want for my local backup solution.

I think if you want something simple and easy to use without needing encryption, this tool could be OK. For anything else, I'd recommend something like restic which is more robust and feature rich.

Unfortunately I don't have the skills to modify the code, so I'm stuck with what it provides (but still 100% grateful for the work that went into it).

Albert444 commented 5 years ago

Hi, I think this simple script is really good - especially the ability to have space-saving incremental backups while still getting the full browsing experience in the file explorer across the different backups.

I suggested a rough solution for the bug three comments above, and so far it is working for me. You could open the script, e.g. with nano, and replace the original code at the two locations with my suggestions. Then check over a few days with a test setup whether this serves your needs...

I am well aware that a really good coder would maybe do it in a more elegant way. So maybe someone is willing to take my idea and build it into the official script, if the solution is fit for it...

Best greetings....

hazelra commented 5 years ago

Thanks Yorkday. That's definitely a good solution to save all backups and have restic do the pruning. If you have any code or scripts that show how you use restic, I and maybe others would find that useful.

Albert, I'm not much of a coder, but I may be able to figure out which lines of code to replace from your post. The first change seems straight forward. The second, which I'll have to check, I'm not sure how much of the original code to cut out. Maybe there is a more elegant way of fixing it, but in my experience there is nothing wrong with inelegant that gets the job done. Hazel

kapitainsky commented 5 years ago

I find this script very useful and the expiration bug very annoying. As I have not seen much activity here, I decided to DIY it. I think I now have a working fix. Anybody interested, please have a look at my fork at https://github.com/kapitainsky/rsync-time-backup in the bugfix-fn_expire_backups branch. The only change is a modified fn_expire_backups function. Here is the new one:

fn_expire_backups() {
    local current_timestamp=$EPOCH
    local last_kept_timestamp=9999999999

    # Process each backup dir from the oldest to the most recent
    for backup_dir in $(fn_find_backups | sort); do

        local backup_date=$(basename "$backup_dir")
        local backup_timestamp=$(fn_parse_date "$backup_date")

        # Skip if failed to parse date...
        if [ -z "$backup_timestamp" ]; then
            fn_log_warn "Could not parse date: $backup_dir"
            continue
        fi

        # If this is the first "for" iteration backup_dir points to the oldest backup
        if [ "$last_kept_timestamp" == "9999999999" ]; then
            # We don't want to delete the oldest backup. It becomes the first "last kept" backup
            last_kept_timestamp=$backup_timestamp
            # As we keep it we can skip processing it and go to the next oldest one
            continue
        fi

        # Find which strategy token applies to this particular backup
        for strategy_token in $(echo $EXPIRATION_STRATEGY | tr " " "\n" | sort -r -n); do
            IFS=':' read -r -a t <<< "$strategy_token"

            # After which date (relative to today) this token applies (X) - we use seconds to get exact cut off time
            local cut_off_timestamp=$((current_timestamp - ${t[0]} * 86400))

            # Every how many days should a backup be kept past the cut off date (Y) - we use days (not seconds)
            local cut_off_interval_days=$((${t[1]}))

            # If we've found the strategy token that applies to this backup
            if [ "$backup_timestamp" -le "$cut_off_timestamp" ]; then

                # Special case: if Y is "0" we delete every time
                if [ $cut_off_interval_days -eq "0" ]; then
                    fn_expire_backup "$backup_dir"
                    break
                fi

                # Calculate the number of whole days since the last kept backup
                local last_kept_timestamp_days=$((last_kept_timestamp / 86400))
                local backup_timestamp_days=$((backup_timestamp / 86400))
                local interval_since_last_kept_days=$((backup_timestamp_days - last_kept_timestamp_days))

                # Check if the current backup is in the interval between
                # the last backup that was kept and Y
                # to determine what to keep/delete we use days difference
                if [ "$interval_since_last_kept_days" -lt "$cut_off_interval_days" ]; then

                    # Yes: Delete that one
                    fn_expire_backup "$backup_dir"
                    # Backup deleted; no point checking shorter-timespan strategies, go to the next backup
                    break
                else
                    # No: Keep it.
                    # As we keep it, this is now the last kept backup
                    last_kept_timestamp=$backup_timestamp
                    # and go to the next backup
                    break
                fi
            fi
        done
    done
}

I would appreciate any comments. I am sure it can be coded in a more elegant way, but I just needed a fix.
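For context, each `EXPIRATION_STRATEGY` token has the form `X:Y`: once a backup is more than X days old, keep one backup every Y days (the script's default strategy is `1:1 30:7 365:30`). The core change above, comparing whole epoch days instead of raw seconds, can be illustrated in isolation (a minimal sketch with made-up timestamps, not part of the script):

```shell
#!/bin/sh
# Two backups 3 hours apart, but on different calendar days (UTC):
# 2018-05-13 23:00 and 2018-05-14 02:00.
ts_a=1526252400   # 2018-05-13 23:00:00 UTC
ts_b=1526263200   # 2018-05-14 02:00:00 UTC

# The old logic compared raw seconds: 10800 < 86400, so with a daily
# interval one of these two backups would be expired.
seconds_diff=$((ts_b - ts_a))

# The new logic compares whole epoch days: the backups land on days
# 17664 and 17665, so both count as distinct daily backups.
days_diff=$(( ts_b / 86400 - ts_a / 86400 ))

echo "seconds_diff=$seconds_diff days_diff=$days_diff"
```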

kjyv commented 5 years ago

I'm trying out your updated expiry logic with my daily backup, and I'm now seeing the script expire the last backup. There are two backups from that day, but shouldn't it expire the earlier one, not the latest one? Also, there should be a proper runnable automatic test for the expiry logic anyway, covering all edge cases. Your approach of using full days instead of seconds sounds good, though.

kapitainsky commented 5 years ago

It will always keep the oldest backup. It is intentional, though I agree it might not suit everybody's taste :) The logic behind it is that people often start a backup service, and when the first backup is finished they feel that their files are safely stored in the backup. Then they delete them from the source to free some space... and then when pruning kicks in, the data is gone.

kapitainsky commented 5 years ago

I posted it here to get some feedback; the details can be easily changed. Now that I have the right way to handle the expiry logic, it is quite simple. I did a reasonable amount of testing, including many edge cases, before publishing it, so I am pretty confident in this function's logic. But of course some bugs are always possible.

kjyv commented 5 years ago

You mean it will always keep the initial backup, or the oldest backup in the time span? In my example, the previous backup on that day is not the initial backup; there are other backups before it. You could argue it is not that important within one day, but if it is within a week or a month, I guess the newer backup is more important.

kapitainsky commented 5 years ago

Initial one.

kjyv commented 5 years ago

Initial is fine and makes sense, but that is not what I meant, see above. In general btw, it's great you took the time to improve the expiry logic, hope it works well otherwise :)

kapitainsky commented 5 years ago

This is all relative; a "day" is just a time measure. It will keep one backup per day (if requested in the strategy), but it won't be exactly the last backup from the calendar day. If you run hourly backups, it is not so important IMHO whether the backup from 23:00 or from 01:00 is kept.

kapitainsky commented 5 years ago

I might modify it and make it more "human", i.e. keep the last backup from each calendar day. I'll give it some time though, to see if any other bugs are discovered.

kjyv commented 5 years ago

A real issue I noticed now is that it uses the last backup as --link-dest in the rsync command, even though this backup directory has now been expired. I guess expiry should in general happen after doing the backup, to avoid re-transferring files that were deleted just before.
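One way to restructure the flow so the previous snapshot is still intact while rsync needs it for hard-linking can be sketched as follows (directory names are made up and the rsync and expiry steps are simulated; this is not the script's real code):

```shell
#!/bin/sh
set -e
# Simulated snapshot cycle: empty directories stand in for backups.
mkdir -p target/2018-05-13-000010 target/2018-05-14-000009

# 1. Resolve the newest existing snapshot BEFORE deleting anything; in the
#    real script this becomes the --link-dest argument.
prev=$(ls -1d target/????-??-??-?????? | sort | tail -n 1)

# 2. Take the new snapshot ("rsync -a --link-dest=$prev source/ $new" in
#    the real script); "$prev" still exists at this point.
new="target/2018-05-15-000008"
mkdir "$new"

# 3. Only now run expiry, so a pruned directory can never have been used
#    as the hard-link source for the snapshot just taken.
rm -rf target/2018-05-13-000010

echo "prev=$prev new=$new"
```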

kjyv commented 5 years ago

I'm not running hourly backups btw, I guess that is only one (maybe not very likely) use case. I'm not always in the same network that my backup location is at, so I might not have any backups for multiple days and expiry should still work sensibly.

kapitainsky commented 5 years ago

Good point, thanks for this comment. I will look into making it more reasonable.

kapitainsky commented 5 years ago

A real issue I noticed now is that it uses the last backup as --link-dest in the rsync command, though this backup directory has now been expired. I guess expiry should in general happen after doing the backup to prevent transferring files again that have just been deleted before.

Agree with this. In general, it is a real shame that this repo is no longer maintained. The script does a fantastic job, but a few things should be fixed.

laurent22 commented 5 years ago

The issue is that it's difficult to test these pull requests, and I don't want to push a change that will break people's backup scripts. If there's a consensus on an expiration strategy PR and it's been tested by a few users, I'm happy to merge it.

filippocarletti commented 5 years ago

@laurent22 the NethServer community is testing @kapitainsky version of the expiration strategy. https://community.nethserver.org/t/rsync-engine-old-backups-missing/ It's working as expected.

hazelra commented 5 years ago

This is excellent news. I would be interested in finding out how it's working for people keeping and expiring backups. Will the code move into the Git project at some point if it continues to work as expected?


laurent22 commented 5 years ago

That's great, thanks for the feedback @filippocarletti. I've just realised I'm not using that pull request myself for my backups so I'm going to start doing so as well, just in case I notice any issue. I think we can probably merge quite soon.

kjyv commented 5 years ago

Works fine for me, too. I made expiry happen after the backups, though. Adding a proper test for the behavior shouldn't be too hard, and it should be a prerequisite for adding new code like this, since it can delete valuable data for people using this in production. Just create a few empty directories and check whether the right ones are being deleted.
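A self-contained version of such a test could exercise just the keep/delete decision, with epoch-day numbers standing in for real backup directories (a hypothetical sketch, not part of the script's test suite):

```shell
#!/bin/sh
# Apply a "keep one backup every 7 days" rule to a list of epoch-day numbers
# and record which ones survive, mirroring the interval check in
# fn_expire_backups (delete when the gap since the last kept backup is < Y).
kept=""
last_kept=-9999
for day in 100 101 104 108 115 116; do
    if [ $((day - last_kept)) -ge 7 ]; then
        kept="$kept $day"     # gap is at least 7 days: keep this backup
        last_kept=$day
    fi
done
echo "kept:$kept"
```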

kapitainsky commented 4 years ago

A real issue I noticed now is that it uses the last backup as --link-dest in the rsync command, though this backup directory has now been expired. I guess expiry should in general happen after doing the backup to prevent transferring files again that have just been deleted before.

I have corrected this issue now. The original pruning function can also be affected in some rare edge cases.

My function can now explicitly preserve some backups, e.g. when --link-dest is used.

All of this is included in pull request #166.

kapitainsky commented 4 years ago

If anybody is happy to give it a try, you can get my version here: `git clone https://github.com/kapitainsky/rsync-time-backup.git -b kptsky_all_OK`

purpltentacle commented 4 years ago

If anybody is happy to give it a try, you can get my version here: `git clone https://github.com/kapitainsky/rsync-time-backup.git -b kptsky_all_OK`

Seems to work fine on my CentOS 7. Thanks.

kapitainsky commented 4 years ago

My pull request has now been included in this repo. Let's hope this issue can be closed soon :)

reactive-firewall commented 4 years ago

What's the status of this ?

hazelra commented 4 years ago

I've been using it for several months now, and it seems to be working properly after the patches. Another confirmation of this would be helpful.


Neustradamus commented 4 years ago

Any news?

yorkday commented 4 years ago

As the original creator of this issue, I can confirm it is resolved and working as expected, so the issue should be closed as it's no longer a bug. I've been running the script for some time now. In terms of improving the retention methodology further, my comments from the original post still stand about replicating the type of retention process used by restic. The current implementation works as designed, but using 'days' as the pruning grain does not allow restic's flexibility of monthly and yearly pruning grains (or the improved simplicity of restic's pruning command line).
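For comparison, restic's retention interface expresses the policy in calendar units rather than day counts. A typical invocation looks like this (illustrative only; it requires restic and an existing repository, and the counts shown are arbitrary):

```shell
# Keep the last 7 daily, 4 weekly, 12 monthly and 3 yearly snapshots,
# then delete the data of everything that fell out of the policy.
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 3 --prune
```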