Closed jimklimov closed 4 years ago
Caveat found : with the current design of znapzendzetup
, the "root options" including --features
must immediately follow the $mainOpt
(e.g. list
), and only then the CLI args relevant for this mainOpt.
So znapzendzetup list --sudo --features=zfsGetType --debug --recursive rpool/SHARED
works as expected, while znapzendzetup list --debug --recursive --features=zfsGetType rpool/SHARED
does not (keeps zfsGetType
not-set).
For a feel of the (kernel) time and text-size trade-offs running these different options, on moderately recent OI SunOS jimoi 5.11 illumos-052042909d i86pc i386 i86pc illumos
after priming the ARC with data (running the selections a few times over so timing is consistent - no storage I/O involved):
root@jimoi:/root# time zfs get -H -r -o name,property,value,source all nvpool/SHARED | wc -l
340273
real 0m31.096s user 0m2.300s sys 0m28.383s
* Constrain to only looking at non-snapshot datasets - `zfs` command completes a lot faster, and outputs less text:
root@jimoi:/root# time zfs get -H -t filesystem,volume -r -o name,property,value,source all nvpool/SHARED | wc -l 2301
real 0m0.186s user 0m0.028s sys 0m0.169s
* Constrain to only looking at non-snapshot datasets AND the "local" source of the attribute, as znapzend does - `zfs` command completes about as fast, and finds just the one dataset which has the backup plan (11 lines):
root@jimoi:/root# time zfs get -H -s local -t filesystem,volume -r -o name,property,value,source all nvpool/SHARED | wc -l 11
real 0m0.208s user 0m0.025s sys 0m0.200s
* After all of the above, re-run the constraint to "local" source but no constraint to dataset type - about as slow as the topmost no-constraints case (a bit faster since less text is pushed around):
root@jimoi:/root# time zfs get -H -s local -r -o name,property,value,source all nvpool/SHARED | wc -l 11
real 0m28.466s user 0m1.966s sys 0m26.428s
ZFS-tree-wise, here are the counts of different dataset types involved:
root@jimoi:/root# time zfs list -Honame -t snapshot -r nvpool/SHARED | wc -l 8925
real 0m10.770s user 0m1.690s sys 0m9.055s
root@jimoi:/root# time zfs list -Honame -t volume -r nvpool/SHARED | wc -l 0
real 0m0.136s user 0m0.025s sys 0m0.127s
root@jimoi:/root# time zfs list -Honame -t filesystem -r nvpool/SHARED | wc -l 31
real 0m0.125s user 0m0.017s sys 0m0.118s
Similar timing on Solaris 10u8 - there is no filtering by dataset type (the crux of this PR) :
[root@x4100 ~]# time zfs get -H -r -o name,property,value,source all rpool/SHARED | wc -l
9219
real 0m4.590s
user 0m0.109s
sys 0m1.638s
[root@x4100 ~]# time zfs get -H -s local -r -o name,property,value,source all rpool/SHARED | wc -l
12
real 0m3.335s
user 0m0.092s
sys 0m1.217s
[root@x4100 ~]# time zfs get -H -t filesystem,volume -r -o name,property,value,source all rpool/SHARED | wc -l
invalid option 't'
usage:
get [-rHp] [-d max] [-o field[,...]] [-s source[,...]]
<"all" | property[,...]> [filesystem|volume|snapshot] ...
The following properties are supported:
...
Amounts of datasets involved:
[root@x4100 ~]# time zfs list -Honame -t snapshot -r rpool/SHARED | wc -l
469
real 0m0.347s
user 0m0.071s
sys 0m0.092s
[root@x4100 ~]# time zfs list -Honame -t filesystem -r rpool/SHARED | wc -l
7
real 0m0.014s
user 0m0.005s
sys 0m0.010s
[root@x4100 ~]# time zfs list -Honame -t volume -r rpool/SHARED | wc -l
0
real 0m0.013s
user 0m0.005s
sys 0m0.010s
Filtering with egrep
to see the "live" datasets only gets a lot less text to parse:
[root@x4100 ~]# zfs get -H -r -o name,property,value,source all rpool/SHARED | egrep '^[^@]*[ \t]' | wc -l
270
And timing of this PR's code on OI:
root@jimoi:/root# time ~jim/shared/znapzend/bin/znapzendzetup list --features=zfsGetType -r nvpool/SHARED | wc -l
cannot open 'backup-adata/snapshots/nvpool/SHARED': dataset does not exist
*** WARNING: destination 'backup-adata/snapshots/nvpool/SHARED' does not exist, will be ignored! ***
NOTE: if you have modified your configuration, send a HUP signal (pkill -HUP znapzend) to your znapzend daemon for it to notice the change. 14
real 0m0.441s user 0m0.189s sys 0m0.260s
* Backwards-compatible by default, but slower:
root@jimoi:/root# time ~jim/shared/znapzend/bin/znapzendzetup list -r nvpool/SHARED | wc -l cannot open 'backup-adata/snapshots/nvpool/SHARED': dataset does not exist WARNING: destination 'backup-adata/snapshots/nvpool/SHARED' does not exist, will be ignored!
NOTE: if you have modified your configuration, send a HUP signal (pkill -HUP znapzend) to your znapzend daemon for it to notice the change. 14
real 0m29.772s user 0m2.200s sys 0m27.498s
Finally note this timing difference is only expected to impact the case of recently added (and explicitly requested via CLI) recursive search for backup plans.
The other recently added mode, which only looks at the specified dataset(s) looks into them directly without recursing further (it may suffer on Solaris 10 with the master branch before this PR though, as it does `get -t` only). The very default mode of looking at all filesystem or volume datasets is similar: it lists them first and then looks at the list items one by one. I timed these to perform similarly (~19s on the OI system above for the full search with and without the new feature).
how about adding a note into the documentation about the motivation ... performance gain for this option.
Hm... my earlier posting stands corrected. Solaris 10u8 does list the properties for snapshots with that zfs get -H -r -o name,property,value,source all rpool/SHARED
request above, but yet does not allow to filter them away right in the zfs
tool.
This research also led to an improvement proposal, to filter away snapshots if our zfs get
call does not as soon as possible, to spend less cycles processing those lines. Seems like this can also prevent errors (passing snapshot names to routines that expect a "live" dataset) if nothing else.
From cursory manual testing, seems this filtering part is relevant when not just recursing datasets, but when asked to look at inherited properties.
The latest red-cross in CI is just ETOOFEWMOCKS ... I'm not gonna fight this bit in t/zfs
and the tests (to have zfs get
return some snapshots too) today, sorry ;)
Trying a fix for the lack of mocks. Found a small omission that acts like a bug, too.
Hm... the latest improvements to tests and code broke something... again :(
Marking as "WIP" to figure out the test fails...
Why oh why?..
In the test I set the envvar into ENV
array, then undef
it. From the test program it is no longer defined. But from the called zfs
it is defined, just without value :\
So where it is intended, it works:
znapzend.t: Now ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED = 1
znapzend.t: Is defined? Y
[Fri Nov 22 17:36:59 2019] [info] znapzend (PID=6311) starting up ...
[Fri Nov 22 17:36:59 2019] [info] refreshing backup plans for dataset "tank/source" ...
# zfs get -H -s local -t filesystem,volume -o name,property,value,source all tank/source
zfs: Now ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED = 1
zfs: Is defined? Y
invalid option 't'
usage:
...
When it is not intended... oh well... :
ok 3 - znapzend --features=zfsGetType --runonce=tank/source fails with old ZFS
Use of uninitialized value $ENV{"ZNAPZENDTEST_ZFS_GET_TYPE_UNHAND"...} in concatenation (.) or string at ./t/znapzend.t line 188.
znapzend.t: Now ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED =
znapzend.t: Is defined? N
[Fri Nov 22 17:36:59 2019] [info] znapzend (PID=6311) starting up ...
[Fri Nov 22 17:36:59 2019] [info] refreshing backup plans for dataset "tank/source" ...
# zfs get -H -s local -t filesystem,volume -o name,property,value,source all tank/source
zfs: Now ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED =
zfs: Is defined? Y
invalid option 't'
usage:
get [-rHp] [-d max] [-o "all" | field[,...]] [-s source[,...]]
<"all" | property[,...]> [filesystem|volume|snapshot] ...
EXCEPTION: No backup set defined or enabled, yet. run 'znapzendzetup' to setup znapzend
not ok 4 - znapzend --features=zfsGetType --runonce=tank/source succeeds with new ZFS when it lists snapshots
Printing done by instrumenting this into both files:
print STDERR "znapzend.t: Now ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED = " . $ENV{'ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED'} . "\n";
print STDERR "znapzend.t: Is defined? " . ( (defined $ENV{'ZNAPZENDTEST_ZFS_GET_TYPE_UNHANDLED'}) ? "Y" : "N" ) . "\n";
Hopefully changing such envvar handling into non-empty (or specific 1
value) checks would fix this.
Green and covered, is there something more to fix here? :)
Green and covered, is there something more to fix here? :)
if you think a PR is done, just click on the review button ... :)
Found an issue that on my OI I needed to filter the recursive
zfs get
output (got all snapshots listed) so introducedget -t
to filter. But now saw that on my Solaris 10 systems this flag is not supported.CC @oetiker and @lotheac for a better look :)