(Closed; jimklimov closed this issue 5 months ago)
At least, I can confirm the observed (maybe not "desired") behavior in the codebase. It prepares a `@cmd` with a generic `zfs send` (no `mbuffer`) at: https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L710-L716

Then, if sending port-to-port (not in an SSH tunnel), it appends a `| mbuffer -O port` to the sender and prepends a `mbuffer -I port |` to the receiver, using the same `$mbuffer` path to the binary on both ends (very much not a given with current cross-platform realities): https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L718-L757
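For illustration, here is a sketch of the two command lines this logic effectively assembles in port-to-port mode. The pool names, host, and port below are hypothetical; the point is that the single configured `$mbuffer` path ends up on both the sending and the receiving host:

```shell
# Hypothetical sketch of the port-to-port pipelines znapzend assembles;
# note the SAME mbuffer path is used on both ends of the transfer.
MBUFFER=/usr/bin/mbuffer        # the single configured $mbuffer path
PORT=9090                       # hypothetical mbuffer listening port
SEND_CMD="zfs send -i tank/data@prev tank/data@now | $MBUFFER -q -O dsthost:$PORT"
RECV_CMD="$MBUFFER -q -I $PORT | zfs recv -F backup/data"
echo "$SEND_CMD"
echo "$RECV_CMD"
```

If `/usr/bin/mbuffer` exists on the source but not on the destination (or lives at a different path there), the receiving half of this pair fails.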
Or, if sending in an SSH tunnel, the `@mbCmd` is inserted between `$remote` and `$recvCmd` (so it is not part of the sender's processes): https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L782-L794
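Schematically (with hypothetical dataset, host, and buffer-size values), the tunnel case comes out with `mbuffer` only inside the remote command:

```shell
# Hypothetical sketch of the SSH-tunnel case: the @mbCmd lands between
# $remote and $recvCmd, i.e. it runs only on the receiving side.
MBUFFER=/usr/bin/mbuffer
REMOTE="ssh user@dsthost"
RECV_CMD="zfs recv -F backup/data"
FULL_CMD="zfs send -i tank/data@prev tank/data@now | $REMOTE '$MBUFFER -q -m 1G | $RECV_CMD'"
echo "$FULL_CMD"
```

So the `zfs send` output goes straight into `ssh` with no buffering on the source system.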
And per `git blame`, this remote-ness of `mbuffer` goes back to the first commits (v0.0.1): https://github.com/oetiker/znapzend/blob/16467ee623bae2fbd373dc7b38d7918992e38114/lib/ZnapZend/ZFS.pm#L60-L76 with an explicit "check if executable is available on remote host" at https://github.com/oetiker/znapzend/blob/16467ee623bae2fbd373dc7b38d7918992e38114/lib/ZnapZend/Config.pm#L125-L130
So, to minimize surprises in the field, any change here should honour that singular `mbuffer` path setting (used for each destination, and for the sender in port-to-port mode), unless overridden by newly defined `src_*` and `dst_N_*` variants, with the old setting documented as deprecated...

At least, this is what I see in practice per znapzend logs (wrapped for readability), e.g.:
...although documentation examples (in the `znapzendzetup` embedded man page) seem to imply that this is (or was originally?) about the sender's local `mbuffer`:

On one hand, having it remote-only adds constraints on the software present (and run-time resources like RAM) on the destination host(s).
On the other hand, if the main goal of `mbuffer` is to level out the burstiness of the original `zfs send` stream generation (and/or, to an extent, of its consumption on the other side), so that the sender is not always blocked on the receiver and vice versa, then the `mbuffer` may as well run on the source system (assuming network speed is roughly constant).

Running the buffer on the sender also allows for a more predictable use of RAM: the sender may control how many streams it is sending and how large their buffers are, but it cannot control how many different systems are currently backing up into the same destination server, nor the impact of the many buffers spawned only there.
In fact, with manual replications I often end up having both (to level out network lags):

```
zfs send | mbuffer | ssh "mbuffer | zfs recv"
```
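Spelled out with explicit (hypothetical) datasets and buffer sizes, that manual double-buffer pattern is roughly:

```shell
# Hypothetical spelled-out version of the double-buffer pattern:
# one mbuffer on the sender, one on the receiver, absorbing network lag.
SRC_BUF="mbuffer -q -s 128k -m 1G"    # sender-side buffer (sizes are hypothetical)
DST_BUF="mbuffer -q -s 128k -m 1G"    # receiver-side buffer
PIPELINE="zfs send -i tank/data@prev tank/data@now | $SRC_BUF | ssh dsthost '$DST_BUF | zfs recv -F backup/data'"
echo "$PIPELINE"
```

Here each side sizes its own buffer, so the sender's RAM use stays predictable regardless of how many other clients are hitting the destination.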
This issue is posted to begin a discussion about perhaps adding another group of settings (`src_mbuffer` and `src_mbuffer_size`?) to optionally use instead of (or in addition to) an `mbuffer` on the destination system.

Technically, it could be more correct to track independent `dst_N_mbuffer(_size)` settings and keep the current one for the source, but this might break some deployments upon upgrade?..

Finally, note that there may be local destinations, and running two mbuffers talking to each other on the same host is overkill. Although... if the user's `znapzendzetup` calls for it? Maybe warn, but honour their choice.
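For concreteness, the proposed knobs next to the existing ones might look like this (everything beyond the existing `mbuffer`/`mbuffer_size` properties is hypothetical, just to anchor the discussion):

```
# existing znapzendzetup properties (today effectively receiver-side):
mbuffer       = /usr/bin/mbuffer
mbuffer_size  = 1G

# proposed, hypothetical additions discussed in this issue:
src_mbuffer       = /usr/bin/mbuffer
src_mbuffer_size  = 1G
# alternative per-destination form, possibly breaking on upgrade:
dst_N_mbuffer     = /opt/local/bin/mbuffer
dst_N_mbuffer_size = 512M
```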