oetiker / znapzend

zfs backup with remote capabilities and mbuffer integration.
www.znapzend.org
GNU General Public License v3.0

Optionally email the admin if a send task failed and we suspend the source cleanup #520

Closed: jimklimov closed this 3 years ago

jimklimov commented 3 years ago

This PR should send the error summary to the specified recipient(s), which fixes issue #499 as far as I (the original poster) am concerned :)

Tested on a local system: the summary is delivered to one or two (comma-separated) local mailboxes using the standard /usr/lib/sendmail or /usr/sbin/sendmail alias (exim and postfix also provide that interface). Internet delivery depends on the MTA setup and is outside znapzend's scope.
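As a rough illustration of the delivery path described above (this is not znapzend's actual Perl code), the Python sketch below splits a comma-separated recipient string and pipes a plain-text report into a sendmail-compatible binary. The function names, the default path, and the use of `sendmail -t` (take recipients from the message headers) are assumptions made for this example.

```python
import subprocess

SUBJECT = "znapzend replication error summary"  # subject seen in the sample mail


def build_error_summary(recipients_arg, report, subject=SUBJECT):
    """Return a minimal mail message for a comma-separated recipient list."""
    recipients = [r.strip() for r in recipients_arg.split(",") if r.strip()]
    if not recipients:
        raise ValueError("no recipients given")
    # Refuse embedded line breaks so a recipient cannot inject extra headers.
    if any("\n" in r or "\r" in r for r in recipients):
        raise ValueError("recipient contains a line break")
    headers = f"To: {', '.join(recipients)}\nSubject: {subject}\n"
    body = f"-------\n{report.rstrip()}\n-------\n"
    # A blank line separates the headers from the body.
    return headers + "\n" + body


def send_via_sendmail(message, sendmail_path="/usr/sbin/sendmail"):
    """Pipe the message to a sendmail-compatible front end (path is an assumption)."""
    # "-t" asks sendmail to read the recipients from the To:/Cc:/Bcc: headers.
    subprocess.run([sendmail_path, "-t"], input=message.encode("utf-8"), check=True)
```

Building the message separately from sending it keeps the formatting testable on hosts without an MTA; only `send_via_sendmail` touches the system.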

:; znapzend --noaction --runonce=nvpool/SHARED/var/test --inherited --debug --cleanOffline --mailErrorSummaryTo=root,jim

...
[2020-08-20 00:29:08.80293] [26384] [warn] ERROR: 1 send task(s) below failed for nvpool/SHARED/var/test, but "cleanOffline" mode is on, so proceeding to clean up source dataset carefully:
[2020-08-20 00:29:08.80315] [26384] [warn]  +-->   destination 'backup-adata/snapshots/nvpool/SHARED/var/test' does not exist or is offline. ignoring it for this round...
[2020-08-20 00:29:08.80643] [26384] [warn] Sending a copy of the report above to root,jim
[2020-08-20 00:29:08.83905] [26384] [debug] checking to clean up snapshots recursively from source nvpool/SHARED/var/test
...

If the mail program is not found at its standard path (currently hard-coded; it could be made an argument later), the failure is logged but is not fatal:

[2020-08-20 00:27:28.54081] [26359] [warn] ERROR: 1 send task(s) below failed for nvpool/SHARED/var/test, but "cleanOffline" mode is on, so proceeding to clean up source dataset carefully:
[2020-08-20 00:27:28.54089] [26359] [warn]  +-->   destination 'backup-adata/snapshots/nvpool/SHARED/var/test' does not exist or is offline. ignoring it for this round...
Can't exec "/usr/sbin/sendmailsss": No such file or directory at /export/home/jim/shared/znapzend/bin/../lib/ZnapZend.pm line 786.
Can't open /usr/sbin/sendmailsss to send a copy of the report above to root,jim!
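The non-fatal behaviour shown in this log can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names, the order in which the two standard paths are probed, and the `-t` flag are choices made for this example and are not taken from ZnapZend.pm.

```python
import logging
import os
import subprocess

log = logging.getLogger("mail-sketch")

# The two conventional sendmail locations; the probing order is an assumption.
MAILER_CANDIDATES = ("/usr/lib/sendmail", "/usr/sbin/sendmail")


def find_mailer(candidates=MAILER_CANDIDATES):
    """Return the first candidate that is an executable file, or None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def try_mail_report(message, candidates=MAILER_CANDIDATES):
    """Try to mail the report; log a warning and return False on any failure."""
    mailer = find_mailer(candidates)
    if mailer is None:
        log.warning("no mail program found in %s; report not sent", list(candidates))
        return False
    try:
        subprocess.run([mailer, "-t"], input=message.encode("utf-8"), check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        # Mail delivery is best effort: the caller keeps going either way.
        log.warning("could not send report via %s: %s", mailer, exc)
        return False
    return True
```

Returning a boolean instead of raising lets the caller continue with snapshot cleanup regardless of whether the notification went out.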

The resulting message in the mailbox:

From jim@jimoi.local Thu Aug 20 00:29:08 2020
Return-Path: <jim@jimoi.local>
Received: from jimoi.local (jimoi [127.0.0.1])
        by jimoi.local (8.15.2+Sun/8.15.2) with ESMTP id 07JMT880026388;
        Thu, 20 Aug 2020 00:29:08 +0200 (CEST)
Received: (from jim@localhost)
        by jimoi.local (8.15.2+Sun/8.15.2/Submit) id 07JMT8wN026387;
        Thu, 20 Aug 2020 00:29:08 +0200 (CEST)
Date: Thu, 20 Aug 2020 00:29:08 +0200 (CEST)
From: jim <jim@jimoi.local>
Message-Id: <202008192229.07JMT8wN026387@jimoi.local>
To: root@jimoi.local, jim@jimoi.local
Subject: znapzend replication error summary
Content-Length: 290

-------
ERROR: 1 send task(s) below failed for nvpool/SHARED/var/test, but "cleanOffline" mode is on, so proceeding to clean up source dataset carefully:
 +-->   destination 'backup-adata/snapshots/nvpool/SHARED/var/test' does not exist or is offline. ignoring it for this round...
-------

jimklimov commented 3 years ago

Bump? This change has been in our production for a month now and spams well :)

One thing still missing for admins who are not at the console is more detail on HOW zfs send failed (at least a copy of the command's stderr, at best a contemporary dmesg snippet), but I think that can be done in another PR. Grabbing that zfs output probably needs some Mojo magic that I don't know yet...
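For illustration only, the general idea of keeping a failed command's stderr for the report looks like this in Python. It is not the Mojo/Perl approach that znapzend would need, the function names are hypothetical, and the demo runs a stand-in child process rather than a real zfs command.

```python
import subprocess
import sys


def run_capturing_stderr(cmd):
    """Run cmd (a list of arguments) and return (exit_code, stderr_text)."""
    # stdout is discarded in this demo; a real send stream would be piped onward.
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, text=True
    )
    return proc.returncode, proc.stderr.strip()


def format_failure_detail(cmd, exit_code, stderr_text):
    """Build a report paragraph that includes the command's stderr."""
    joined = " ".join(cmd)
    detail = stderr_text if stderr_text else "(no stderr output)"
    return f"command '{joined}' failed with exit code {exit_code}:\n{detail}"


if __name__ == "__main__":
    # Stand-in for a failing command: a child process that writes to stderr, exits 3.
    fake = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('cannot open dataset\\n'); sys.exit(3)",
    ]
    code, err = run_capturing_stderr(fake)
    if code != 0:
        print(format_failure_detail(fake, code, err))
```

The formatted text could then be appended to the error summary before it is mailed, so a remote admin sees why the send failed and not just that it did.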