do not send intermediate snapshots

lotheac commented 4 years ago

Hi,

it seems znapzend is using zfs send -I to send snapshots to the destination. We found this surprising, since it includes intermediate snapshots (in our setup we use snapshots taken on the source for purposes other than backup, e.g. synchronizing the most current state to other production nodes). Consider the following setup:

% znapzendzetup create SRC '5min=>30sec' buildpool/zztest/src DST '5min=>30sec' buildpool/zztest/dst
*** backup plan: buildpool/zztest/src ***
           dst_0 = buildpool/zztest/dst
      dst_0_plan = 5minutes=>30seconds
         enabled = on
         mbuffer = off
    mbuffer_size = 1G
   post_znap_cmd = off
    pre_znap_cmd = off
       recursive = off
             src = buildpool/zztest/src
        src_plan = 5minutes=>30seconds
        tsformat = %Y-%m-%d-%H%M%S
      zend_delay = 0

Do you want to save this backup set [y/N]? y
NOTE: if you have modified your configuration, send a HUP signal
(pkill -HUP znapzend) to your znapzend daemon for it to notice the change.

... wait for znapzend to operate the first time ...

terra ~ % zfs list -rt all buildpool/zztest
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
buildpool/zztest                          72K   890G    24K  legacy
buildpool/zztest/dst                      24K   890G    24K  legacy
buildpool/zztest/dst@2019-11-15-151000     0B      -    24K  -
buildpool/zztest/src                      24K   890G    24K  legacy
buildpool/zztest/src@2019-11-15-151000     0B      -    24K  -

then take a snapshot manually:

terra ~ % zfs snapshot buildpool/zztest/src@manual-snapshot

What we expected here was that the manually taken snapshot would not be sent to DST. But -I sends all the intermediate snapshots too, so manual-snapshot also ends up on DST, and nothing would ever clean it up from there (even if it was eventually destroyed on the source):

terra ~ % zfs list -rt all buildpool/zztest
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
buildpool/zztest                          72K   890G    24K  legacy
buildpool/zztest/dst                      24K   890G    24K  legacy
buildpool/zztest/dst@2019-11-15-151000     0B      -    24K  -
buildpool/zztest/dst@manual-snapshot       0B      -    24K  -
buildpool/zztest/dst@2019-11-15-151030     0B      -    24K  -
buildpool/zztest/src                      24K   890G    24K  legacy
buildpool/zztest/src@2019-11-15-151000     0B      -    24K  -
buildpool/zztest/src@manual-snapshot       0B      -    24K  -
buildpool/zztest/src@2019-11-15-151030     0B      -    24K  -

I cannot see a reason to not use -i instead, to make sure that znapzend does not put snapshots it will never destroy onto DST. So here's a diff to avoid intermediate snapshots from ending up on DST, by changing send -I to send -i.
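
To make the difference concrete, here is a small sketch that models which snapshots would be created on DST under each flag. It never calls zfs; sent_snapshots is a hypothetical helper, not part of znapzend, and it only works on a creation-ordered list of snapshot names like the ones in the listing above.

```shell
#!/bin/sh
# Model of which snapshots a receive would create on DST.
# Usage: sent_snapshots MODE FROM TO   (snapshot names on stdin, oldest first)
#   MODE "i": only TO arrives            (like zfs send -i FROM TO)
#   MODE "I": everything after FROM up to
#             and including TO arrives   (like zfs send -I FROM TO)
# Hypothetical helper for illustration only; it does not run zfs.
sent_snapshots() {
    awk -v mode="$1" -v from="$2" -v to="$3" '
        $0 == from          { seen = 1; next }   # start after the common snapshot
        seen && $0 == to    { print; exit }      # the target snapshot is always sent
        seen && mode == "I" { print }            # intermediates only with -I
    '
}

snaps='2019-11-15-151000
manual-snapshot
2019-11-15-151030'

echo "with -i:"
printf '%s\n' "$snaps" | sent_snapshots i 2019-11-15-151000 2019-11-15-151030
echo "with -I:"
printf '%s\n' "$snaps" | sent_snapshots I 2019-11-15-151000 2019-11-15-151030
```

With -i only the newest timestamped snapshot shows up; with -I manual-snapshot comes along as well, which matches the listing above.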

coveralls commented 4 years ago

Coverage remained the same at 89.822% when pulling 2423cc31d96c76e8c574d637203b67f6789a3dd9 on lotheac:master into bfb68766ba5a13bcaba87292cb04191138bc9d3e on oetiker:master.

oetiker commented 4 years ago

The reason we are using -I is that in this way znapzend will pass on any intermediate snapshots it may have missed due to some network outage or because it took too long transferring a particularly big snapshot ...

to fight 'leftover' snapshots the remote cleanup routine would have to be enhanced ... this would be a worthwhile thing by all means !

lotheac commented 4 years ago

On Fri, Nov 15 2019 07:20:34 -0800, Tobias Oetiker wrote:

The reason we are using -I is that in this way znapzend will pass on any intermediate snapshots it may have missed due to some network outage or because it took too long transferring a particularly big snapshot ...

In my opinion, if it's important for znapzend to avoid "gaps" like this, it should instead try to send each missing snapshot (according to DST retention policy) to the destination. In a policy with, for example, 1d=>1h on SRC and 1month=>1d on DST, after such an outage of say, six hours, sending five intermediate hourlies just to destroy them afterwards doesn't strike me as particularly productive.
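
As a rough sketch of that idea: after an outage, pick only the snapshots that a daily DST plan would keep and send those. The helper first_per_day is hypothetical, and which snapshot of a day the DST plan would actually retain is an assumption here (this sketch takes the first one); the names are assumed to follow the %Y-%m-%d-%H%M%S tsformat from the plan above.

```shell
#!/bin/sh
# first_per_day: from timestamp-named snapshots on stdin (oldest first),
# print the first snapshot of each calendar day.
# Assumption: names start with YYYY-MM-DD, as with tsformat %Y-%m-%d-%H%M%S.
# Hypothetical helper; znapzend's real retention logic may choose differently.
first_per_day() {
    awk '{ day = substr($0, 1, 10) }              # YYYY-MM-DD prefix
         !(day in seen) { seen[day] = 1; print }  # keep first name seen per day'
}

# Hourly snapshots that piled up on SRC during an outage:
printf '%s\n' \
    2019-11-14-220000 \
    2019-11-14-230000 \
    2019-11-15-000000 \
    2019-11-15-010000 | first_per_day
```

Only the selected names would then need an incremental send each; the other hourlies never reach DST, so nothing has to be destroyed there afterwards.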

(BTW, we were also kind of surprised by the fact that znapzend sends to DST every hour in that scenario, as opposed to once a day. But that's a separate thing and I do kind of understand the reason there anyway.)

to fight 'leftover' snapshots the remote cleanup routine would have to be enhanced ... this would be a worthwhile thing by all means !

I think a safer position to take would be that znapzend does not send, receive, or destroy any snapshots that it has not taken or sent itself, so your proposal seems to me a bit dangerous :)
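
One way to approximate "only snapshots znapzend took itself" is to match names against the plan's tsformat (%Y-%m-%d-%H%M%S in the setup above). This is only a name-based heuristic, not proof of who created a snapshot, and only_plan_snapshots is a hypothetical helper rather than anything znapzend provides.

```shell
#!/bin/sh
# only_plan_snapshots: keep snapshot names (stdin, one per line) that look
# like the tsformat %Y-%m-%d-%H%M%S, e.g. 2019-11-15-151000.
# Heuristic only: a manual snapshot could use the same naming scheme.
only_plan_snapshots() {
    # "|| true" keeps the exit status 0 when nothing matches
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}$' || true
}

printf '%s\n' 2019-11-15-151000 manual-snapshot 2019-11-15-151030 |
    only_plan_snapshots
```

Here manual-snapshot is filtered out, so a send or cleanup routine built on this list would leave it alone.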

-- Lauri Tirkkonen | lotheac @ IRCnet

jimklimov commented 4 years ago

I think the issue is not brought up for the first time here :)

From my PoV, sending such manual intermediate snapshots is a good thing, though I agree that the stance is site- (and admin-)dependent, so making this optional could be worthwhile. For me this is good because just today I had a server hiccup with too many old snapshots collected (a destination problem). The fast way out was to make the manual snapshots, have znapzend do its magic on the larger sub-trees (much I/O, so gigabytes of snaps for just megabytes of "live" data at any time), and then I knew that snapshots older than this manual one were safe to delete from the origin system, so its ZFS is no longer collapsing due to low free space and fragmentation. A full znapzend run takes a large part of the day, if not several, on that box and its backup link, and we needed it back to life and service ASAP, so picking at the worst offenders quickly was worthwhile.

I agree this is not too common, but also not an excluded variant. Making it optional and non-default (e.g. part of runonce handling) is an option :)

On the technical side, I believe a zfs send -I | zfs recv... passing a queue of intermediate snaps can be orders of magnitude faster than a loop of truly incremental steps of one snapshot each. Both can be slower, however, than just sending the increments (maybe skipping some original snaps) that are relevant for the destination's retention policy. E.g. if you keep hourly snaps on origin and daily on backup, it is messy to send all the hourlies of recent day to backup and then remove 23 of them there. Looking at run logs, I feel this is the logic that happens today, but am not certain.
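
For comparison, the two strategies look roughly like this as commands. The sketch only prints the zfs send invocations (dry run) and leaves out the receiving side; print_single_stream and print_stepwise are hypothetical helper names, and the first snapshot given is assumed to already exist on the destination.

```shell
#!/bin/sh
# Print (do not run) the send commands for each strategy.
# Arguments: DATASET, then snapshot names, oldest first.
# Assumption: the first snapshot is the common one already present on DST.

# One stream that carries every intermediate snapshot.
print_single_stream() {
    ds=$1; shift
    first=$1
    for last in "$@"; do :; done     # walk to the last argument
    echo "zfs send -I $ds@$first $ds@$last"
}

# A loop of truly incremental steps, one snapshot each.
print_stepwise() {
    ds=$1; shift
    prev=$1; shift
    for snap in "$@"; do
        echo "zfs send -i $ds@$prev $ds@$snap"
        prev=$snap
    done
}

print_single_stream buildpool/zztest/src 2019-11-15-151000 manual-snapshot 2019-11-15-151030
print_stepwise      buildpool/zztest/src 2019-11-15-151000 manual-snapshot 2019-11-15-151030
```

The stepwise variant pays the send/receive setup cost once per snapshot, which is where I would expect the slowdown to come from; sending only the increments relevant to the destination would be a third variant that skips names in the stepwise list.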

lotheac commented 4 years ago

On Thu, Nov 21 2019 05:54:40 -0800, Jim Klimov wrote:

On the technical side, I believe a zfs send -I | zfs recv... passing a queue of intermediate snaps can be orders of magnitude faster than a loop of truly incremental steps of one snapshot each.

of course it is, but we are talking about an error recovery scenario here, not normal operation. normally znapzend sends to DST as often as SRC is snapshotted.

Both can be slower, however, than just sending the increments (maybe skipping some original snaps) that are relevant for the destination's retention policy. E.g. if you keep hourly snaps on origin and daily on backup, it is messy to send all the hourlies of recent day to backup and then remove 23 of them there. Looking at run logs, I feel this is the logic that happens today, but am not certain.

yes, if you take hourly snaps on SRC but want to retain 1 daily snap on DST, znapzend sends each hourly to DST on the hour and removes the previous hourly.

-- Lauri Tirkkonen | lotheac @ IRCnet

jimklimov commented 4 years ago

Thanks @lotheac for the good points.

I proposed a compromise at https://github.com/lotheac/znapzend/pull/1 to have both camps satisfied, and aware of pitfalls (or benefits... POV-dependent...) with zfs send -I as well ;)

jimklimov commented 4 years ago

Dug a bit in the history (was interested if this was something I broke, or was some recent surprise...) and found that the big -I was there from the beginning:

jimklimov commented 4 years ago

Presumably this PR gets superseded by #459 ;)

oetiker / znapzend

do not send intermediate snapshots #455