digint / btrbk

Tool for creating snapshots and remote backups of btrfs subvolumes
https://digint.ch/btrbk/
GNU General Public License v3.0

raw target: add option to not use the closest parent but the one that is the last to be rotated away #508

Closed: calestyo closed this issue 1 year ago

calestyo commented 1 year ago

Hey.

I noticed this while trying out the branch from #474 ... I'm not sure whether what I describe below could already be done via incremental_prefs; if so, I didn’t understand it.

One thing I’ve noticed (when trying https://github.com/digint/btrbk/commits/delete-incremental-raw) is what I have right now:

-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T000012+0100.btrfs.gpg  5h      preserve hourly: first of hour, 2 hours ago
-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T004649+0100.btrfs.gpg  5h      preserve forced: parent of preserved raw target
-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T010012+0100.btrfs.gpg  5h      preserve hourly: first of hour, 1 hours ago
-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T020012+0100.btrfs.gpg  5h      preserve hourly: first of hour, 0 hours ago
-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T023659+0100.btrfs.gpg  5h      preserve min: latest

with the following files:

-rw-r--r-- 1 root root  220 Nov 20 00:04 data.20221120T000012+0100.btrfs.gpg.info
-rw-r--r-- 1 root root  93M Nov 20 00:46 data.20221120T004649+0100.btrfs.gpg
-rw-r--r-- 1 root root  268 Nov 20 00:46 data.20221120T004649+0100.btrfs.gpg.info
-rw-r--r-- 1 root root 8,3M Nov 20 01:00 data.20221120T010012+0100.btrfs.gpg
-rw-r--r-- 1 root root  268 Nov 20 01:00 data.20221120T010012+0100.btrfs.gpg.info
-rw-r--r-- 1 root root  96M Nov 20 02:00 data.20221120T020012+0100.btrfs.gpg
-rw-r--r-- 1 root root  268 Nov 20 02:00 data.20221120T020012+0100.btrfs.gpg.info

i.e. only the oldest one is a full backup, and the sidecar (.info) files are as follows:

-e -n #btrbk-v0.32.5
# Do not edit this file
#t=1668898813
TYPE=raw
FILE=data.20221120T000012+0100.btrfs.gpg
RECEIVED_UUID=6fe09732-fdac-0a47-9087-0b4191679ef9
encrypt=gpg
INCOMPLETE=1

-e -n #t=1668899065
INCOMPLETE=0

#btrbk-v0.32.6-dev
# Do not edit this file
#t=1668901610
TYPE=raw
FILE=data.20221120T004649+0100.btrfs.gpg
RECEIVED_UUID=e63b0251-d34e-484d-939e-388a9dc40b51
RECEIVED_PARENT_UUID=6fe09732-fdac-0a47-9087-0b4191679ef9
encrypt=gpg
INCOMPLETE=1
#t=1668901616
INCOMPLETE=0
#btrbk-v0.32.6-dev
# Do not edit this file
#t=1668902414
TYPE=raw
FILE=data.20221120T010012+0100.btrfs.gpg
RECEIVED_UUID=e51b9ff7-9260-8247-a103-f57ed959ccff
RECEIVED_PARENT_UUID=e63b0251-d34e-484d-939e-388a9dc40b51
encrypt=gpg
INCOMPLETE=1
#t=1668902415
INCOMPLETE=0
#btrbk-v0.32.6-dev
# Do not edit this file
#t=1668906012
TYPE=raw
FILE=data.20221120T020012+0100.btrfs.gpg
RECEIVED_UUID=f99fd7a5-ed28-6541-943c-1def6cc571d5
RECEIVED_PARENT_UUID=e51b9ff7-9260-8247-a103-f57ed959ccff
encrypt=gpg
INCOMPLETE=1
#t=1668906019
INCOMPLETE=0

AFAIU, the one from 20221120T004649 would in principle be deleted, because I have 5h and there’s also the one from 20221120T000012 (which is closer to the full hour at 20221120T0000 and thus the one to be kept). It isn’t deleted, though, because it’s the parent of 20221120T010012 and of the other following incremental dumps.
Right so far?
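
To make the dependency chain explicit, here is a minimal Python sketch that follows the RECEIVED_PARENT_UUID fields copied from the sidecar files above. The `ancestors` helper and the dict layout are illustrative assumptions; this is not how btrbk itself resolves parents.

```python
# Fields copied from the .info sidecars quoted above.
# The dict layout and the helper below are illustrative only.
SIDECARS = [
    {"FILE": "data.20221120T000012+0100.btrfs.gpg",
     "RECEIVED_UUID": "6fe09732-fdac-0a47-9087-0b4191679ef9",
     "RECEIVED_PARENT_UUID": None},  # no parent: full backup
    {"FILE": "data.20221120T004649+0100.btrfs.gpg",
     "RECEIVED_UUID": "e63b0251-d34e-484d-939e-388a9dc40b51",
     "RECEIVED_PARENT_UUID": "6fe09732-fdac-0a47-9087-0b4191679ef9"},
    {"FILE": "data.20221120T010012+0100.btrfs.gpg",
     "RECEIVED_UUID": "e51b9ff7-9260-8247-a103-f57ed959ccff",
     "RECEIVED_PARENT_UUID": "e63b0251-d34e-484d-939e-388a9dc40b51"},
    {"FILE": "data.20221120T020012+0100.btrfs.gpg",
     "RECEIVED_UUID": "f99fd7a5-ed28-6541-943c-1def6cc571d5",
     "RECEIVED_PARENT_UUID": "e51b9ff7-9260-8247-a103-f57ed959ccff"},
]


def ancestors(file_name, sidecars):
    """Return the files that `file_name` depends on, nearest parent first."""
    by_file = {s["FILE"]: s for s in sidecars}
    by_uuid = {s["RECEIVED_UUID"]: s for s in sidecars}
    chain = []
    current = by_file[file_name]
    while current["RECEIVED_PARENT_UUID"] is not None:
        parent = by_uuid.get(current["RECEIVED_PARENT_UUID"])
        if parent is None:
            break  # parent is missing on the target; chain ends here
        chain.append(parent["FILE"])
        current = parent
    return chain


for name in ancestors("data.20221120T010012+0100.btrfs.gpg", SIDECARS):
    print(name)
```

With the data above, 20221120T010012 depends on 20221120T004649, which in turn depends on the full backup 20221120T000012, so neither can be removed while 20221120T010012 is kept.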

In principle the rotating is of course correct, but wouldn't it perhaps make sense here to not use the closest parent (20221120T004649) when creating the incremental 20221120T010012, but the closest one that is (with the current retention policy) the last to be rotated away?

The idea here is that one typically has some cron/systemd timer, which would then create cycles of full/incremental backups... while in addition there might be manually created ones (like above 20221120T004649). And then one wants the normal cycle to be used, rather than such intermediate backups, which should go away after target_preserve_min.

OTOH, any runs without such an option (i.e. manual runs) would indeed use the closest parent.
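
The difference between the two strategies can be sketched in Python as follows. This is only a toy model of the proposal: the `Backup` type, both function names and the `expires` values are hypothetical (the expiry times are assumed for illustration, not taken from btrbk output), and btrbk has no such option.

```python
from datetime import datetime
from typing import NamedTuple


class Backup(NamedTuple):
    name: str
    created: datetime
    expires: datetime  # assumed: when the retention policy would drop it


def closest_parent(candidates, new_time):
    """Current behaviour as described above: newest backup older than the new one."""
    older = [b for b in candidates if b.created < new_time]
    return max(older, key=lambda b: b.created)


def longest_lived_parent(candidates, new_time):
    """Proposed behaviour: the older backup that would be rotated away last
    (ties broken in favour of the closer one)."""
    older = [b for b in candidates if b.created < new_time]
    return max(older, key=lambda b: (b.expires, b.created))


candidates = [
    # Scheduled backup, kept as "first of hour"; expiry is an assumed value.
    Backup("data.20221120T000012+0100",
           datetime(2022, 11, 20, 0, 0, 12), datetime(2022, 11, 20, 5, 0, 0)),
    # Manual backup, only kept while it is the latest; expiry is an assumed value.
    Backup("data.20221120T004649+0100",
           datetime(2022, 11, 20, 0, 46, 49), datetime(2022, 11, 20, 1, 0, 12)),
]
new_time = datetime(2022, 11, 20, 1, 0, 12)

print(closest_parent(candidates, new_time).name)
print(longest_lived_parent(candidates, new_time).name)
```

Under these assumptions the first strategy picks the manual 20221120T004649 backup, while the second picks the scheduled 20221120T000012 one, so the manual backup would no longer be pinned as a parent.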

Not sure whether this makes sense. ^^

In any case... it’s not really important to me.

Cheers, Chris.

digint commented 1 year ago

Can't do easily, and does not make much sense in my opinion:

calestyo commented 1 year ago

Well, the only reason I thought this might possibly make sense is saving disk space (which, however, is not that important anyway, given that we're talking about incremental dumps).

The idea was basically, that one likely wants to keep the "scheduled" backups, just as they're scheduled... i.e. to have a wayback machine for different time periods.
The "manual" backups however, may be just some temporary thing, e.g. when trying out some new upgrades or so.
But if the subsequent (incremental) "scheduled" backups were based on that "manual" one, it would need to be kept until all of them are rotated away.

Anyway as I've already said before,... I wasn't even sure myself how useful that would actually be.

calestyo commented 1 year ago

Just another similar thing for your attention/consideration, before we can close this issue:

Again, when trying out the incremental rotation, I played around a bit with faketime … btrbk -n -S run --override incremental=no in order to see when the current backups would be rotated away.

What I noticed, I guess, is that under some bad circumstances one might run out of backups (well at least all but the latest one), which again, may destroy any wayback functionality that people might want:

I had basically the following:

# faketime "Nov 21 07:05:40 CET 2022" btrbk -n -S run --override incremental=no
WARNING: Found deprecated option "btrfs_commit_delete each" in "/etc/btrbk/btrbk.conf" line 93
WARNING: ... Please use "btrfs_commit_delete yes|no"
WARNING: ... Using "btrfs_commit_delete yes"
SNAPSHOT SCHEDULE
-----------------
ACTION  SUBVOLUME                                                                          SCHEME                     REASON
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T000002+0100  2d+ 7h 1d (sunday, 00:00)  -
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T040012+0100  2d+ 7h 1d (sunday, 00:00)  -
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T080012+0100  2d+ 7h 1d (sunday, 00:00)  -
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T120003+0100  2d+ 7h 1d (sunday, 00:00)  -
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T160012+0100  2d+ 7h 1d (sunday, 00:00)  -
delete  /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T200007+0100  2d+ 7h 1d (sunday, 00:00)  -
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T000011+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T040012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T080012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T120003+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T160006+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221119T200012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 2 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T000012+0100  2d+ 7h 1d (sunday, 00:00)  preserve daily: first of day, 1 days ago, at 00:00
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T004649+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T010012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T020012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T030012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T040012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T050005+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T060012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T070012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T080002+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T090012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T100012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T110012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T120012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T130012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T140012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T150012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T160011+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T170012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T180012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T190012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T200009+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T210012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221120T220012+0100  2d+ 7h 1d (sunday, 00:00)  preserve min: 1 days ago
-       /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221121T070540+0100  2d+ 7h 1d (sunday, 00:00)  preserve daily: first of day, 0 days ago, 7h after 00:00

BACKUP SCHEDULE
---------------
ACTION  HOST                      SUBVOLUME                                                                                                        SCHEME  REASON
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T160011+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T170012+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T180012+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T190012+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T200009+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T210012+0100.btrfs.gpg  5h      -
delete  backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T220012+0100.btrfs.gpg  5h      -
-       backup1.example.org  /var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221121T070540+0100.btrfs.gpg  5h      preserve hourly: first of hour, 0 hours ago

--------------------------------------------------------------------------------
Backup Summary (btrbk command line client, version 0.32.6-dev)

    Date:   Mon Nov 21 07:05:40 2022
    Config: /etc/btrbk/btrbk.conf
    Dryrun: YES

Legend:
    ===  up-to-date subvolume (source snapshot)
    +++  created subvolume (source snapshot)
    ---  deleted subvolume
    ***  received subvolume (non-incremental)
    >>>  received subvolume (incremental)
--------------------------------------------------------------------------------
/data/btrfs-top-level-subvolumes/system/data
+++ /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221121T070540+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T000002+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T040012+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T080012+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T120003+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T160012+0100
--- /data/btrfs-top-level-subvolumes/system/snapshots/btrbk/data.20221118T200007+0100
*** backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221121T070540+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T160011+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T170012+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T180012+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T190012+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T200009+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T210012+0100.btrfs.gpg
--- backup1.example.org:/var/local/lcg-backup/data/btrbk/source-host.example.org/system/data.20221120T220012+0100.btrfs.gpg

NOTE: Dryrun was active, none of the operations above were actually executed!

So what I think might happen is this: suppose the backups fail for a longer period, e.g. because the target host is down.

When it comes up again, the most recent backups on it may already be well beyond their retention period (even if target_preserve_min was used) and all but the newly created one would be dropped.
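
A simplified Python model of an hourly retention scheme shows the effect. It assumes that "5h" means "keep the first backup of each of the last 5 hour slots, plus the latest one"; that is an approximation for illustration, not btrbk's actual scheduling code, and `hourly_keep` is a hypothetical name.

```python
from datetime import datetime, timedelta


def hourly_keep(backups, now, hours):
    """Keep the first backup of each hour slot younger than `hours`,
    plus the most recent backup. Simplified model, not btrbk's algorithm."""
    now_slot = now.replace(minute=0, second=0, microsecond=0)
    first_of_hour = {}
    for ts in sorted(backups):
        slot = ts.replace(minute=0, second=0, microsecond=0)
        first_of_hour.setdefault(slot, ts)  # earliest backup wins the slot
    keep = {ts for slot, ts in first_of_hour.items()
            if now_slot - slot < timedelta(hours=hours)}
    if backups:
        keep.add(max(backups))
    return sorted(keep)


# Target backups from the BACKUP SCHEDULE above, plus the newly created one.
backups = [
    datetime(2022, 11, 20, 16, 0, 11),
    datetime(2022, 11, 20, 17, 0, 12),
    datetime(2022, 11, 20, 18, 0, 12),
    datetime(2022, 11, 20, 19, 0, 12),
    datetime(2022, 11, 20, 20, 0, 9),
    datetime(2022, 11, 20, 21, 0, 12),
    datetime(2022, 11, 20, 22, 0, 12),
    datetime(2022, 11, 21, 7, 5, 40),
]
now = datetime(2022, 11, 21, 7, 5, 40)

for ts in hourly_keep(backups, now, hours=5):
    print(ts.isoformat())
```

In this model, every backup from before the gap falls outside the 5-hour window once the next run happens, so only the backup created at 07:05:40 is kept, which matches the dry run above.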

In a way this is of course the desired behaviour... but it may also "break" the wayback functionality.

Not sure whether there's really anything one could do about it. Even something like an option that causes at least n backups to be retained (regardless of their times) would probably not help in every situation.
Or would you say that the solution for that is, that people simply need to retain backups long enough back into the past (like 1m or so)?

Anyway... feel free to close this issue! :-)

digint commented 1 year ago

I think what you describe here is pretty much a duplicate of #308

calestyo commented 1 year ago

Thanks for the pointer. :-)

Should you ever feel like implementing https://github.com/digint/btrbk/issues/508#issue-1456733113, just re-open.

But as I've said... no strong desire from my side, thus closing.