The `mbuffer` settings relate to the remote system only, is this right?

At least, this is what I see in practice in the znapzend logs, e.g. (wrapped for readability):

    # zfs send -Lce -I 'rpool/home/abuild@znapzend-auto-2024-01-16T00:00:00Z' \
        'rpool/home/abuild@znapzend-auto-2024-01-16T11:51:05Z'\
        |ssh -o batchMode=yes -o ConnectTimeout=30 znapzend \
            'mbuffer -q -s 256k -W 600 -m 128M\
            |zfs recv -u -F pond/export/DUMP/NUTCI/znapzend/ci-deb/rpool/home/abuild'

...although the documentation examples (in the man page embedded in `znapzendzetup`) seem to imply that this is (was originally?) about the sender's local mbuffer:

> Specify the path to your copy of the mbuffer utility.
>
> Specify the path to your copy of the mbuffer utility and the port used on the destination. Caution: znapzend will send the data directly from source mbuffer to destination mbuffer, thus data stream is not encrypted.
>
> `znapzendzetup create --recursive --mbuffer=/opt/omni/bin/mbuffer ...`

On one hand, having it remote-only puts constraints on the destination host(s): the software must be present there, along with enough run-time resources such as RAM.

On the other hand, if the main goal of mbuffer is to smooth out the burstiness of the original ZFS send stream generation (and, to an extent, of its consumption on the other side), so that the sender is not always blocked on the receiver and vice versa, then mbuffer may as well run on the source system (assuming network speed is roughly constant).

Running the buffer on the sender also makes RAM use more predictable: the sender controls how many streams it is sending and how large their buffers are, but it cannot control how many other systems are backing up to the same destination server at the same time, nor the impact of the many buffers spawned only there.

In fact, with manual replications I often end up with both (to smooth out network lags): `zfs send | mbuffer | ssh "mbuffer | zfs recv"`
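Spelled out, that manual double-buffered pipeline looks roughly like the sketch below. It only prints the command instead of running it; the dataset, snapshot and host names are made-up placeholders, and the buffer options are simply borrowed from the log excerpt above, not a recommendation.

```shell
# Dry-run sketch: print a send pipeline with an mbuffer on BOTH ends.
# All names below (datasets, snapshots, host) are hypothetical placeholders.
SRC_FROM='tank/data@snap1'
SRC_TO='tank/data@snap2'
DST_HOST='backuphost'
DST_DATASET='backup/data'
MBUF_OPTS='-q -s 256k -m 128M'   # block and buffer sizes as seen in the log

build_pipeline() {
    # Sender side: zfs send feeds a local mbuffer, which feeds ssh.
    printf '%s' "zfs send -I '$SRC_FROM' '$SRC_TO'"
    printf '%s' " | mbuffer $MBUF_OPTS"
    printf '%s' " | ssh -o BatchMode=yes '$DST_HOST'"
    # Receiver side: a second mbuffer sits in front of zfs recv.
    printf '%s\n' " \"mbuffer $MBUF_OPTS | zfs recv -u -F '$DST_DATASET'\""
}

build_pipeline
```

Printing first makes it easy to eyeball the quoting before pasting the result into a real shell.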

This issue is posted to start a discussion about perhaps adding another group of settings (`src_mbuffer` and `src_mbuffer_size`?) to optionally use an mbuffer on the source system instead of (or in addition to) the one on the destination system.

Technically, it could be more correct to track independent `dst_N_mbuffer(_size)` settings and keep the current one for the source, but this might break some deployments upon upgrade.

Finally, note that there may be local destinations, and running two mbuffers talking to each other on the same host is overkill. Although... what if the user's znapzendzetup calls for it? Maybe warn, but honour their choice.

At least, I can confirm the observed (maybe not "desired") behavior in the codebase.

I see it prepare a `@cmd` with a generic `zfs send` (no mbuffer) at: https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L710-L716
and then, if sending port-to-port (not through an SSH tunnel), it appends `| mbuffer -O port` to the sender and prepends `mbuffer -I port |` to the receiver, using the same `$mbuffer` path to the binary on both sides (very much not a given across platforms): https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L718-L757
or, if sending through an SSH tunnel, the `@mbCmd` is inserted between `$remote` and `$recvCmd` (so it is not part of the sender processes): https://github.com/oetiker/znapzend/blob/7c7565970e74ed165bc4a8b79839c4e46cebbfbb/lib/ZnapZend/ZFS.pm#L782-L794
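To make the two code paths concrete, here is a rough shell rendering of what ends up running on each side. It paraphrases the behavior described above and is not a transcription of `ZFS.pm`: the `describe_mode` helper, the host name, the port and the binary path are all illustrative assumptions, and the real send/recv options are left elided.

```shell
# Sketch of where mbuffer ends up in each transport mode.
# describe_mode and every value passed to it are illustrative assumptions.
describe_mode() {
    mode=$1 mbuffer=$2 host=$3 port=${4:-}
    send='zfs send ...'
    recv='zfs recv ...'
    case $mode in
        port)
            # Port-to-port: mbuffer runs on BOTH hosts, with the same path.
            echo "source: $send | $mbuffer -O ${host}:${port}"
            echo "remote: $mbuffer -I ${port} | $recv"
            ;;
        ssh)
            # SSH tunnel: mbuffer runs ONLY inside the remote command.
            echo "source: $send | ssh ${host} '$mbuffer | $recv'"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

describe_mode port /usr/bin/mbuffer backuphost 9090
describe_mode ssh  /usr/bin/mbuffer backuphost
```

In neither mode does the sender get a buffer of its own that is decoupled from the receiver, which is the gap this issue is about.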

And per git blame, this remote-ness of mbuffer dates back to the first commits (v0.0.1): https://github.com/oetiker/znapzend/blob/16467ee623bae2fbd373dc7b38d7918992e38114/lib/ZnapZend/ZFS.pm#L60-L76 along with an explicit "check if executable is available on remote host" at https://github.com/oetiker/znapzend/blob/16467ee623bae2fbd373dc7b38d7918992e38114/lib/ZnapZend/Config.pm#L125-L130

So, to minimize surprises in the field, any change here should honour the existing single `mbuffer` path setting (used for each destination, and for the sender in port-to-port mode) unless it is overridden by newly defined `src_*` and `dst_N_*` variants, and the old setting should be documented as deprecated...
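One possible shape for that fallback rule is sketched below. The `src_mbuffer` and `dst_N_mbuffer` names are only the ones floated in this issue and do not exist in znapzend today; the `resolve_mbuffer` helper and the use of plain shell variables as the settings store are hypothetical, chosen just to show the precedence.

```shell
# Sketch: decide which mbuffer path applies to a role ("src" or "dst_N").
# src_mbuffer / dst_N_mbuffer are PROPOSED names, not existing options.
mbuffer='/usr/bin/mbuffer'              # legacy single setting
src_mbuffer=''                          # proposed; empty means "not set"
dst_0_mbuffer='/opt/omni/bin/mbuffer'   # proposed per-destination override
dst_1_mbuffer=''

resolve_mbuffer() {
    role=$1 mode=${2:-ssh}
    # Accept only "src" or "dst_<digits>" before using the name in eval.
    case $role in
        src) ;;
        dst_*)
            n=${role#dst_}
            case $n in ''|*[!0-9]*) return 1 ;; esac
            ;;
        *) return 1 ;;
    esac
    # Look up the role-specific variable, e.g. $dst_0_mbuffer.
    eval "specific=\${${role}_mbuffer:-}"
    if [ -n "$specific" ]; then
        echo "$specific"
    elif [ "$role" != src ] || [ "$mode" = port ]; then
        # Legacy behavior: every destination, plus the sender in port mode.
        echo "$mbuffer"
    fi
    # Otherwise print nothing: no sender-side mbuffer, as over SSH today.
}

echo "dst_0:      $(resolve_mbuffer dst_0)"
echo "dst_1:      $(resolve_mbuffer dst_1)"
echo "src (ssh):  $(resolve_mbuffer src ssh)"
echo "src (port): $(resolve_mbuffer src port)"
```

With this precedence an existing configuration that only sets the legacy path keeps behaving as it does now, and the new variants take effect only where someone sets them explicitly.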

The `mbuffer` settings relate to the remote system only, is this right? #629