Closed faern closed 2 years ago
Not getting errors because of occasional network errors might be difficult. :)
However, you can use `--ignore-transfer-errors` to make it skip the exit-code check during transfer. It will still verify that the snapshot actually exists on the target.
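For context, a minimal sketch of what a cron entry using that flag could look like (the target host, backup group name, and schedule below are illustrative, not from this thread):

```
# crontab fragment (hypothetical names): run hourly, tolerate transient
# transfer errors but still fail if the snapshot is missing on the target
0 * * * * zfs-autobackup --ssh-target backup.example.com --ignore-transfer-errors offsite1
```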
If you want better monitoring check out https://github.com/psy0rz/zfs_autobackup/wiki/Monitoring#monitoring-example-with-zabbix-jobs
Also, try this: https://github.com/psy0rz/zfs_autobackup/wiki/Performance#speeding-up-ssh
It usually makes things both faster and more reliable.
Thank you for the input and ideas! I'll read up on that.
Interestingly enough, setting up `ControlPath`/`ControlMaster` silenced the errors completely. Knock on wood, but previously I got 3-4 per day; now I have not seen a single such error email in a few days.
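For anyone finding this later, a minimal `~/.ssh/config` sketch that enables connection multiplexing (the host alias is hypothetical; `ControlPersist` is optional but keeps the master connection alive between hourly runs):

```
# ~/.ssh/config -- reuse one master connection instead of opening a new
# TCP/ssh session for every zfs/ssh invocation
Host backup-target
    ControlMaster auto
    ControlPath ~/.ssh/control-%r@%h:%p
    ControlPersist 10m
```

With multiplexing, only the first connection performs the full TCP and ssh handshake, which is why it tends to sidestep rate-limiting by sshd or firewalls.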
Yes that was probably it. This also happens if an ssh port is exposed to the internet and is getting hammered by all kinds of bots and scripts. Every so often sshd will then refuse a connection.
(or if sshd or a firewall just thinks you're reconnecting to ssh too often)
I recently upgraded `zfs-autobackup` from version ~3.0 (I can't remember exactly which version, but it was a release candidate) to 3.1.3. I also upgraded ZFS on my source machine at the same time, but since this error is about the target, I think that's irrelevant. Now I get this error fairly frequently. Not on every run of the script, but a few times per day (I run it every hour). I have substituted the domain and the name of the dataset:
I don't think my remote suddenly started doing anything differently, so I assume zfs-autobackup is now checking the exit code more strictly. The man page for `zfs list` does not specify exit codes, but ssh exits with code 255 if an error occurred, so I assume it's that. Should `zfs-autobackup` maybe be able to handle that more gracefully? An error did indeed happen, but I need a way to not get cron daemon error emails because of occasional network errors :thinking:
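Until then, one workaround is a small cron wrapper that retries only on ssh's connection-failure exit code (255) and lets real errors through to cron's mail. A sketch, assuming a POSIX shell; the `zfs-autobackup` invocation in the comment uses hypothetical target/group names:

```shell
#!/bin/sh
# retry ATTEMPTS CMD...: rerun CMD only when it exits 255 (ssh connection
# error); any other exit code is returned immediately so cron still mails
# on genuine failures. RETRY_DELAY seconds between attempts (default 30).
retry() {
    attempts=$1; shift
    i=1
    while :; do
        "$@"
        status=$?
        [ "$status" -ne 255 ] && return "$status"  # success or a real error
        [ "$i" -ge "$attempts" ] && return "$status"
        i=$((i + 1))
        sleep "${RETRY_DELAY:-30}"
    done
}

# Example usage (hypothetical names):
# retry 3 zfs-autobackup --ssh-target backup.example.com offsite1
```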