Closed jimklimov closed 1 year ago
Looking at the regex, it seems like trying to catch the port is indeed the culprit (for the local path, without any hostspec in the string)
Eh, not port, just a separator I guess. Playing at https://regex101.com/r/6wXZhu/1 this expression splits root@localhost:rpool/zones/nut/jenkins-worker/ROOT/openindiana-2022-03-15-backup-1
into two groups - root@localhost
and the dataset path (note no colons in test path; regex fails if there are any).
Off-topic: Wondering now how it would pass a non-22 SSH port? Is there some option or just ~/.ssh/config
should be used to define params for a "hostname"? Is that documented so then? (Just quickly wondering, did not check)
Commit 58fbfc4f50ff55160d232113a43d781b4c767e1e seems to have last substantially changed the regex as
bugfix: ':' character in snapshot name does now work
-my $splitHostDataSet = sub { return ($_[0] =~ /^(?:([^:]+):)?([^:]+)$/); };
+my $splitHostDataSet = sub { return ($_[0] =~ /^(?:([^:\/]+):)?([^:]+|[^:@]+\@.+)$/); };
First it seemed like simplifying this to
^(?:([^:\/]+):)?(.+)$
would help
But there is a corner case with a pool root dataset (no slashes) and tsformat with colons:
rpool@snaptime-12:34
How does one confidently tell this apart from user rpool
on ssh server snaptime-12
accessing (root) dataset 34
? :\
(more so if that pool also has colons in the name like 34:56
?)
I think use of colon to separate $remote:$dataSet
was not such a good idea :-D Can it be replaced by e.g. whitespace? Is it an internal implementation detail or something users (or third-party code) expect?
UPDATE: Silly me; it is the [[user@]host:]dataset
documented for DST
in znapzendzetup
... But still, the earlier posted ideas remain:
^(?:([^\s\/]+)\s)?(.+)$
works to split correctly into two groups this string:
root@host rpool@snaptime-12:34:56
and even permits (syntactically) to add a port number, if we wished to handle it later on:
root@host:22 rpool@snaptime-12:34:56
and still makes an (undef, dataset)
from plain rpool@snaptime-12:34:56
for local paths without a remote
It does collapse to one second group grabbing whole input string if there is a slash in first token, like "user@host:22/tcp" but:
Fiddling more with it, this seems better viable: https://regex101.com/r/ambWtC/1
^(?:(.+)\s)?([^\s]+)$
Parses this:
root@host -p 22 -id /tmp/key rpool@snaptime-12:34:56
into usable args for ssh
: group 1 == root@host -p 22 -id /tmp/key
(which is everything until last whitespace)
and the dataset (group 2 which is just the last long non-whitespace token)
And it still works for just the local dataset name (then group1 = undef)
Scratch that last shot for now, has some issue to track down with specs like remotehost:pond/dataset/name
:\
Also the premise of root@host -p 22 -id /tmp/key rpool@snaptime-12:34:56
in the later posts was wrong - ~/.ssh/config
should pass the options, so the DST value to parse is as documented in e.g. znapzendzetup --help
:
[[user@]host:]dataset
with complexity of what chars dataset
may contain validly (and current regex does not cover that all), including obscene cases like say dataset
being named c:
(or pond/c:data
) because why not, making it hard to discern from the user@host:
prefix (especially when there is none and DST is local).
So the winner currently was a slight mod of the original:
my $splitHostDataSet = sub {
- return ($_[0] =~ /^(?:([^:\/]+):)?([^:]+|[^:@]+\@.+)$/);
+ return ($_[0] =~ /^(?:([^:\/]+):)?([^@\s]+|[^@\s]+\@[^@\s]+)$/);
In my
dmesg
I see a series of messages in SUBJ, grouped as a burst of 10 and another 7 half a minute later.Current codebase has it a bit shifted by line numbers, but otherwise this area seems same as installed:
https://github.com/oetiker/znapzend/blob/27f965ff84fc496d5d2c93bd74b2a745666633fa/lib/ZnapZend/ZFS.pm#L209-L215
This is in
listSnapshots
, and looking at debug output and the code, I believe it is trying to parse a dataset name with colons (the only thing that apparently differs from successful logs) andsplitHostDataSet
fails to do it properly (can colons be treated as e.g. port separator for general-case SSH remote?) Colons in dataset names are injected by current OpenIndianapkg
handling of backup rootfs'es.https://github.com/oetiker/znapzend/blob/27f965ff84fc496d5d2c93bd74b2a745666633fa/lib/ZnapZend/ZFS.pm#L40-L42
Afterwards per symptoms,
$dataSet
isundef
and operations with it emit warnings (two actually, just one in dmesg though) until it fails in call tozfs
program where it specifies no dataset to list: