Closed: jimklimov closed this issue 4 years ago
Seems the best place for the change is to first process `$backupSet->{src}` (rather than the generalized `$srcSubDataSets` right away), using a `destroySnapshots()` extended with recursive-flag support like that used in `createSnapshots()`, around https://github.com/oetiker/znapzend/blob/4ccbe714186ac1fbc72a81e0548e6178279b8c76/lib/ZnapZend.pm#L354
Adding the recursion right into `destroySnapshots()` as a forced codepath would likely backfire: child datasets won't have the named snapshots (already listed in `sendRecvCleanup()`) to delete, so it would call `zfs` in vain and waste resources and/or fail. I guess recursive deletion should be independent of the one-by-one mode, with a list based on the survivors of the recursive deletion.
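To make the intended ordering concrete, here is a rough Python sketch of the idea: one recursive destroy at the backup-plan root first, then an explicit pass only over child datasets that still hold a matching snapshot. This is illustrative only; `plan_cleanup` and its in-memory dataset map are my invention, not znapzend's actual Perl code.

```python
def plan_cleanup(root, datasets, snap):
    """Sketch of recursive-first cleanup.

    datasets maps dataset name -> set of snapshot suffixes it holds.
    Returns the zfs CLI commands (as strings) the cleanup would issue.
    """
    cmds = []
    covered = set()
    if snap in datasets.get(root, set()):
        # One 'zfs destroy -r root@snap' removes the snapshot from the root
        # and from every descendant that has it, in a single invocation.
        cmds.append(f"zfs destroy -r {root}@{snap}")
        covered = {d for d in datasets
                   if d == root or d.startswith(root + "/")}
    # Follow-up pass: only "survivors" -- children holding the snapshot
    # that the recursive destroy did not cover -- need one-by-one calls.
    for d in sorted(datasets):
        if d.startswith(root + "/") and snap in datasets[d] and d not in covered:
            cmds.append(f"zfs destroy {d}@{snap}")
    return cmds
```

With this ordering, the per-child loop usually finds nothing left to do, so the expensive one-by-one path becomes the exception rather than the rule.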
That sounds like a good approach! Looking forward to your PR.
The PR got to a state where it seems to work for me and produces neither surprises nor Perl warnings, so feel free to test.
I made a stack of datasets and allowed my non-root user to play with those:
sudo zfs create rpool/export/test
sudo zfs create rpool/export/test/dst
sudo zfs create rpool/export/test/src
sudo zfs create rpool/export/test/src/child
sudo zfs allow -ldu jim clone,create,destroy,diff,mount,promote,rollback,snapshot,share,sharenfs,sharesmb,canmount,mountpoint,send,receive,mount,hold rpool/export/test
and made a setup for quick testing (every minute, little retention):
$ sudo ./bin/znapzendzetup create --recursive SRC '3min=>1min' rpool/export/test/src DST '1min=>1min' rpool/export/test/dst
*** backup plan: rpool/export/test/src ***
dst_0 = rpool/export/test/dst
dst_0_plan = 1minute=>1minute
enabled = on
mbuffer = off
mbuffer_size = 1G
post_znap_cmd = off
pre_znap_cmd = off
recursive = on
src = rpool/export/test/src
src_plan = 3minutes=>1minute
tsformat = %Y-%m-%d-%H%M%S
zend_delay = 0
and bombarded it with
$ ./bin/znapzend -d --features=oracleMode --runonce rpool/export/test/src
and
$ ./bin/znapzend -d --runonce rpool/export/test/src
Example output:
jim@jimoo018:~/shared/znapzend$ ./bin/znapzend -d --features=oracleMode --runonce rpool/export/test/src
[Thu Oct 11 23:02:36 2018] [info] znapzend (PID=1743) starting up ...
[Thu Oct 11 23:02:36 2018] [info] refreshing backup plans...
[Thu Oct 11 23:02:36 2018] [info] found a valid backup plan for rpool/export/test/src...
[Thu Oct 11 23:02:36 2018] [info] znapzend (PID=1743) initialized -- resuming normal operations.
[Thu Oct 11 23:02:36 2018] [debug] snapshot worker for rpool/export/test/src spawned (1747)
[Thu Oct 11 23:02:36 2018] [info] creating recursive snapshot on rpool/export/test/src
# zfs snapshot -r rpool/export/test/src@2018-10-11-230236
[Thu Oct 11 23:02:36 2018] [info] checking ZFS dependent datasets from 'rpool/export/test/src' explicitely excluded
# zfs list -H -o name -t filesystem,volume
# zfs get -H -s local -o value org.znapzend:enabled rpool/export/test/src
# zfs get -H -s local -o value org.znapzend:enabled rpool/export/test/src/child
[Thu Oct 11 23:02:36 2018] [debug] snapshot worker for rpool/export/test/src done (1747)
[Thu Oct 11 23:02:36 2018] [debug] send/receive worker for rpool/export/test/src spawned (1752)
[Thu Oct 11 23:02:36 2018] [info] starting work on backupSet rpool/export/test/src
# zfs list -H -r -o name -t filesystem,volume rpool/export/test/src
[Thu Oct 11 23:02:36 2018] [debug] sending snapshots from rpool/export/test/src to rpool/export/test/dst
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/src
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/dst
# zfs send -I rpool/export/test/src@2018-10-11-225226 rpool/export/test/src@2018-10-11-230236|zfs recv -F rpool/export/test/dst
[Thu Oct 11 23:02:37 2018] [debug] sending snapshots from rpool/export/test/src/child to rpool/export/test/dst/child
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/src/child
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/dst/child
# zfs send -I rpool/export/test/src/child@2018-10-11-225226 rpool/export/test/src/child@2018-10-11-230236|zfs recv -F rpool/export/test/dst/child
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/dst
[Thu Oct 11 23:02:37 2018] [debug] cleaning up snapshots recursively under rpool/export/test/dst
# zfs destroy -r rpool/export/test/dst@2018-10-11-225208
# zfs destroy -r rpool/export/test/dst@2018-10-11-225226
[Thu Oct 11 23:02:37 2018] [debug] now will look if there is anything to clean in children of rpool/export/test/dst
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/dst/child
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/src
[Thu Oct 11 23:02:37 2018] [debug] cleaning up snapshots recursively under rpool/export/test/src
# zfs destroy -r rpool/export/test/src@2018-10-11-222250
# zfs destroy -r rpool/export/test/src@2018-10-11-222426
# zfs destroy -r rpool/export/test/src@2018-10-11-222505
# zfs destroy -r rpool/export/test/src@2018-10-11-222644
# zfs destroy -r rpool/export/test/src@2018-10-11-223057
# zfs destroy -r rpool/export/test/src@2018-10-11-223503
# zfs destroy -r rpool/export/test/src@2018-10-11-223610
# zfs destroy -r rpool/export/test/src@2018-10-11-223709
# zfs destroy -r rpool/export/test/src@2018-10-11-223954
# zfs destroy -r rpool/export/test/src@2018-10-11-224007
# zfs destroy -r rpool/export/test/src@2018-10-11-224150
# zfs destroy -r rpool/export/test/src@2018-10-11-224234
# zfs destroy -r rpool/export/test/src@2018-10-11-224311
# zfs destroy -r rpool/export/test/src@2018-10-11-224433
# zfs destroy -r rpool/export/test/src@2018-10-11-224506
# zfs destroy -r rpool/export/test/src@2018-10-11-224631
# zfs destroy -r rpool/export/test/src@2018-10-11-224716
# zfs destroy -r rpool/export/test/src@2018-10-11-224834
# zfs destroy -r rpool/export/test/src@2018-10-11-225120
# zfs destroy -r rpool/export/test/src@2018-10-11-225208
# zfs destroy -r rpool/export/test/src@2018-10-11-225226
[Thu Oct 11 23:02:37 2018] [debug] now will look if there is anything to clean in children of rpool/export/test/src
# zfs list -H -o name -t snapshot -s creation -d 1 rpool/export/test/src/child
[Thu Oct 11 23:02:37 2018] [info] done with backupset rpool/export/test/src in 1 seconds
[Thu Oct 11 23:02:37 2018] [debug] send/receive worker for rpool/export/test/src done (1752)
Are you cleaning recursively in any case, or only for filesets which have recursive enabled?
Deployed this change to our Solaris 10 server (backported to the cswznapzend release); no data seems eaten :)
The difference in timing for comparable resync jobs was dramatic, especially where the ZFS trees were big and branchy; here are runs with the old code a couple of days ago and with the new one today, for three different trees:
/var/tmp/znap.log:real 15m35.561s
/var/tmp/znap.log:user 0m2.046s
/var/tmp/znap.log:sys 0m10.186s
/var/tmp/znap2.log:real 6m24.706s
/var/tmp/znap2.log:user 0m1.682s
/var/tmp/znap2.log:sys 0m10.616s
###
/var/tmp/znap.log:real 1509m31.119s
/var/tmp/znap.log:user 1m56.800s
/var/tmp/znap.log:sys 4m5.472s
/var/tmp/znap2.log:real 73m36.980s
/var/tmp/znap2.log:user 0m23.250s
/var/tmp/znap2.log:sys 1m39.852s
###
/var/tmp/znap.log:real 0m59.732s
/var/tmp/znap.log:user 0m1.306s
/var/tmp/znap.log:sys 0m4.248s
/var/tmp/znap2.log:real 0m26.057s
/var/tmp/znap2.log:user 0m0.866s
/var/tmp/znap2.log:sys 0m1.819s
Note that the loop listing remaining snapshots in recursive child datasets one by one should perhaps also be optimized into a single `zfs` call followed by one `destroySnapshots()` call: just looking to find nothing to do took about 6 minutes.
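The batching idea could look roughly like this: replace the per-child `zfs list -d 1` calls with one recursive listing (e.g. `zfs list -H -r -o name -t snapshot <root>`) and group its output by dataset in memory. A hypothetical Python sketch; `group_snapshots` is not part of znapzend:

```python
from collections import defaultdict

def group_snapshots(listing_lines):
    """Parse the output of a single recursive snapshot listing
    (one 'dataset@snap' name per line) into {dataset: [suffixes]},
    so no further per-child zfs invocations are needed."""
    by_dataset = defaultdict(list)
    for line in listing_lines:
        name = line.strip()
        if "@" not in name:
            continue  # skip filesystems/volumes mixed into the listing
        dataset, snap = name.split("@", 1)
        by_dataset[dataset].append(snap)
    return dict(by_dataset)
```

One fork of `zfs` plus an in-memory lookup per child replaces N forks, which is exactly where those idle 6 minutes went.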
And yes, the recursive cleanup mode currently only kicks in (both for destinations and for the source) if the source dataset policy has recursive backup enabled.
So with this patch on the server for a week, it has not suffered any unexpected losses :) and mostly manages to keep up with its snap/sync/cleanup loops: with a 2hr=>1hr policy it sees some 3-4 snapshots in the source (where I think +1 is by design, so 3 snaps in place are okay), not a couple of days of backlog like when we had 1day=>1hr and no recursive destroy.
I also read today about the ability to have a recursive backup plan but have it disabled on some child datasets. I wonder if this recursive mass-removal would contradict that in any way (e.g. cleaning up snapshots of datasets that we did not intend to actively back up? Then it's good... Or datasets we did not intend to touch at all? Then it's bad...)
Closing the issue as the solution proposed in #386 got merged today :)
The behavior I see currently, both in practice and in the code, is that after the snapshots are made and sent, they are destroyed: as one huge list of arguments by default, or one by one in `oracleMode`. Our setup uses an extensive tree of datasets with a recursive znapzend policy starting from a low-hanging branch of the pool, so there are typically thousands of snaps to delete, and processing each one takes the host several seconds for all the synchronous contexts; the loop just does not keep up with the job, and we run out of space due to obsolete snapshots eating it up with referenced but unneeded bits.

Note 1: if I use backgrounded `zfs destroy` commands from a loop and/or a recursive destroy, it takes about as long to destroy hundreds of snapshot datasets as it takes to destroy one, so there ought to be some common big lock there, effectively caching and coalescing these requests.

Note 2: in fact, the processing of blocks getting freed up is background and asynchronous even on later Solaris 10 releases and on illumos nowadays, and can take minutes after the `zfs` CLI commands have completed and returned. But getting those blocks onto the hit-list takes a considerable while.
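As a rough illustration of the two deletion modes mentioned above (hypothetical Python, not znapzend's Perl; the exact CLI strings znapzend emits may differ), the default mode can fold many snapshots of one dataset into a single comma-separated `zfs destroy`, while the one-by-one mode pays the full per-invocation cost each time:

```python
def destroy_commands(snapshots, oracle_mode=False):
    """snapshots: full names like ['pool/ds@s1', 'pool/ds@s2'],
    all on the SAME dataset.  Default mode emits one comma-separated
    'zfs destroy ds@s1,s2,...'; oracle_mode emits one call per snapshot
    (for zfs implementations without the comma-separated form)."""
    if oracle_mode:
        return [f"zfs destroy {s}" for s in snapshots]
    dataset = snapshots[0].split("@", 1)[0]
    suffixes = ",".join(s.split("@", 1)[1] for s in snapshots)
    return [f"zfs destroy {dataset}@{suffixes}"]
```

With thousands of snapshots and several seconds of synchronous overhead per `zfs` invocation, the difference between one call and N calls is what makes the cleanup loop fall behind.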
I see that creation of snapshots takes the ZFS recursion support into account at https://github.com/oetiker/znapzend/blob/c604a86857430258c2b8479c356437c0f61a4dc6/lib/ZnapZend/ZFS.pm#L221 but removal does not seem to: https://github.com/oetiker/znapzend/blob/c604a86857430258c2b8479c356437c0f61a4dc6/lib/ZnapZend/ZFS.pm#L235
My suggestion is to start the cleanup phase with a quick recursive deletion of the specified snapshot name from the root branch that has an individual local znapzend setup, especially if we know that it was making recursive snapshots in the first place. Then we can follow up with the existing logic to find possibly missed snapshots in child datasets that should also be removed. Hopefully this latter part would usually have nothing to do.
I'll try to give this idea a run in our deployments before PRing, but comments are welcome in general :)