oetiker / znapzend

zfs backup with remote capabilities and mbuffer integration.
www.znapzend.org
GNU General Public License v3.0

Too many znapzend destinations cause SSH error #582

Open nahall opened 1 year ago

nahall commented 1 year ago

I've been using znapzend for many years and have 17 backup plans, each going to 2 destinations on the same server, for a total of 34 connections to that server. I recently added "--mailErrorSummaryTo=root" to my config and found that several times per day I was getting errors on random plans, although many runs still completed successfully.

The znapzend debug log contained:

kex_exchange_identification: read: Connection reset by peer
Connection reset by 10.20.0.33 port 22

I checked the server's auth.log for the times when this happened and found:

sshd[4301]: error: beginning MaxStartups throttling
sshd[4301]: drop connection #10 from xxxxx on xxxxx past MaxStartups

The default MaxStartups for sshd is 10:30:100 (start:rate:full). Once 10 unauthenticated connections are pending, sshd drops each additional connection attempt with a probability of 30%, and that probability rises linearly to 100% at 100 pending connections. That is why it worked in some cases but not others.
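
To make the throttling rule concrete, here is a rough Python sketch of the drop probability implied by a start:rate:full setting. The function name `drop_probability` is made up for illustration, and the floating-point ramp is only an approximation of what sshd does internally.

```python
def drop_probability(unauthenticated, start=10, rate=30, full=100):
    """Approximate chance (0.0 to 1.0) that sshd refuses a new connection.

    `unauthenticated` is the number of connections that have not yet
    finished authenticating. Hypothetical helper, not part of OpenSSH.
    """
    if unauthenticated < start:
        return 0.0  # below the threshold nothing is dropped
    if unauthenticated >= full:
        return 1.0  # at or above `full` everything is dropped
    # Linear ramp from `rate` percent at `start` up to 100 percent at `full`.
    ramp = (100 - rate) * (unauthenticated - start) / (full - start)
    return (rate + ramp) / 100.0


# Compare the default setting with the raised one for a few load levels.
for pending in (5, 10, 34):
    default = drop_probability(pending)
    raised = drop_probability(pending, start=40)
    print(pending, round(default, 2), round(raised, 2))
```

With the default setting, a burst of 34 near-simultaneous connections spends most of its time above the threshold of 10, so some attempts are likely to be refused; with a start value of 40, the same burst stays below the threshold.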

I have subsequently modified the server's /etc/ssh/sshd_config to contain:

MaxStartups 40:30:100

This seems to have fixed the problem. However, having to modify the sshd_config of a server that isn't even running znapzend is not an ideal solution. It seems like it would be better to modify znapzend so it doesn't attempt to open every ssh connection simultaneously.
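
As an illustration of the kind of client-side limit suggested here, the following Python sketch runs 34 jobs while never letting more than 8 run at once. It is not znapzend code (znapzend is written in Perl); `run_bounded`, `make_fake_jobs`, and the limit of 8 are all hypothetical, and the jobs only sleep instead of opening real SSH connections.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(jobs, max_parallel=8):
    """Run callables with at most `max_parallel` active; return results in order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


def make_fake_jobs(count):
    """Build stand-in jobs that record peak concurrency instead of using SSH."""
    state = {"active": 0, "peak": 0}
    lock = threading.Lock()

    def job(index):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stands in for the SSH handshake
        with lock:
            state["active"] -= 1
        return index

    # Bind each index at definition time so every job returns its own number.
    jobs = [lambda i=i: job(i) for i in range(count)]
    return jobs, state


jobs, state = make_fake_jobs(34)
results = run_bounded(jobs, max_parallel=8)
print(len(results), "jobs done, peak concurrency", state["peak"])
```

Keeping the limit below the server's MaxStartups start value (10 by default) would keep the number of pending handshakes from this client under the throttling threshold, assuming no other clients are connecting at the same moment.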

Before I added --mailErrorSummaryTo, this apparently happened several times every day. I just didn't notice, because the backup would usually succeed within an hour or two thanks to the randomness of the MaxStartups throttling. Adding the error notification brought the issue to light.