crabique closed this issue 3 years ago
Making znapzend faster is always a good thing; how well this works is best judged based on an actual implementation → PR welcome!
It seems this use case could also benefit from a solution to #438, which would reduce the number of spawned processes. Also, using several backup schedules for smaller subtrees already allows some parallelization.
For a straightforward parent/child layout where the children have no local znapzend config, #438 seems like the better approach. For all other use cases, rather than specifying this in a filesystem setup, it'd be nice to be able to tell the znapzend daemon:
--parallel=N
--mbuffer=/usr/bin/mbuffer:FIRSTPORT
where --parallel=N sets up to N parallel send/recvs and --mbuffer=...:FIRSTPORT would have mbuffer use the port range FIRSTPORT to FIRSTPORT+N. If a filesystem specifies --mbuffer= then it gets handled serially; all others get handled with up to N in parallel. A filesystem could have --mbuffer=NONE, or --mbuffer= with an empty value, to disable the daemon-level mbuffer config and fall back to ssh only.
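To make the proposal a bit more concrete, here is a minimal Python sketch of the scheduling idea only. It is not znapzend code (znapzend has no such option today as far as this thread shows); `replicate_all` and the `send` callable are hypothetical names, and the sketch assumes one port per worker, i.e. FIRSTPORT to FIRSTPORT+N-1.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def replicate_all(datasets, parallel, first_port, send):
    """Run send(dataset, port) for every dataset.

    datasets: list of (name, own_mbuffer) pairs; own_mbuffer is True when the
              filesystem carries its own mbuffer setting (handled serially).
    send:     hypothetical callable standing in for one send/recv; it gets
              port=None for the serially handled datasets.
    """
    serial = [name for name, own in datasets if own]
    pooled = [name for name, own in datasets if not own]

    # Assumption: one port per worker, first_port .. first_port + parallel - 1.
    ports = queue.Queue()
    for offset in range(parallel):
        ports.put(first_port + offset)

    def worker(name):
        port = ports.get()          # blocks until a port is free
        try:
            return send(name, port)
        finally:
            ports.put(port)         # hand the port back for the next dataset

    with ThreadPoolExecutor(max_workers=parallel) as pool:
        results = list(pool.map(worker, pooled))

    # Filesystems with their own mbuffer setting run one at a time.
    results += [send(name, None) for name in serial]
    return results
```

Because each worker borrows a port from the queue and returns it afterwards, no two transfers in flight ever share a port, which is the property the receiving mbuffer listeners would need.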
I seem to have the opposite problem: I have a lot of replicated datasets, both within a system and to a remote system. When these run, they cause performance issues on the system, and I'd like to be able to limit them or schedule them so that they do not run at the same time. I only mention it here because the code that implements a configurable number of parallel streams could also be used to restrict that number.
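As a rough illustration of why the two requests could share code, a concurrency cap can be sketched in Python as below. `ReplicationGate` is a hypothetical helper, not part of znapzend; with a limit of 1 it serializes jobs, and with a larger limit it allows that many at once.

```python
import threading


class ReplicationGate:
    """Caps how many replication jobs run at once (hypothetical helper)."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0   # highest concurrency observed, for inspection

    def run(self, job):
        with self._slots:               # waits while `limit` jobs are active
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return job()
            finally:
                with self._lock:
                    self.running -= 1
```

Every replication job would be wrapped in `gate.run(...)`, so the same knob covers both "go faster" (raise the limit) and "go easier on the system" (lower it).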
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi!
We are using znapzend to back up a dataset with a high number of datasets inside it (>10k), making and sending snapshots daily. We are also using the mbuffer port option.
Unfortunately, as the number of datasets grew, we noticed huge running times for the backup tasks, about 5-6 times slower than on our other server, where we have the same amount of data but it's all one big dataset without recursion.
Upon closer inspection, we found that znapzend spawns a new ssh process for every child dataset, and this was slow. To speed it up a little, we added ssh multiplexing options to .ssh/config so that it at least reuses the SSH connection.
This was not enough and it was still very slow, because it still takes some time to spawn an instance of znapzend that spawns an ssh process that spawns an mbuffer | zfs recv on the recv end, and then actually transfer even an empty snapshot (on average ~3.5 seconds).
So the feature request is as follows: since recursive send is more "atomic", it could be possible to send datasets in parallel on different mbuffer ports. For example, the configuration could look like this:
Which would mean there should be 4 znapzend workers each sending and receiving snapshots on those ports in parallel.
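To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes exactly 10,000 datasets (we have somewhat more), the ~3.5 s fixed cost per dataset mentioned above, and perfect scaling across workers, so treat the results as optimistic estimates; `total_hours` is just an illustrative helper.

```python
def total_hours(datasets, seconds_each, workers=1):
    """Wall-clock hours if the per-dataset cost divides evenly across workers."""
    batches = -(-datasets // workers)      # ceiling division
    return batches * seconds_each / 3600


serial = total_hours(10_000, 3.5)          # about 9.7 hours
parallel = total_hours(10_000, 3.5, 4)     # about 2.4 hours
print(f"serial: {serial:.1f} h, 4 workers: {parallel:.1f} h")
```

So even before any data is transferred, the per-dataset overhead alone accounts for most of a day when run serially, and 4 workers would cut that to roughly a quarter under these assumptions.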
Please let me know what you think about this.