Open chris001 opened 6 years ago
This is a nice feature idea, but honestly not one we are likely to get to any time soon. In fact, scanning logs to detect serious errors could be an entire product in itself :-) (Splunk, for example). The reason it's hard is that the logs of most servers are full of errors caused by bad logins or invalid email destinations, so separating those from the real unrecoverable errors requires detailed knowledge of each server's behavior and log format (which likely changes with every release).
Probably a one-line shell command PER SERVICE (apache, nginx, postfix, proftpd, mysql, postgresql, etc.) would catch the most glaring fatal errors, such as those that flat out prevent Dovecot from loading and serving any clients.
Something like:
tail -100 /var/log/mail.err | grep Fatal
Oct 30 11:34:41 server1 dovecot: imap-login: Fatal: Can't set cipher list to 'ECDHE-RSA-AES256-SHA384:AES256-SHA256:AES256-SHA256:RC4:HIGH:MEDIUM:+TLSv1:+TLSv1_1:+TLSv1_2:!LOW:!MD5:!SSLv2:!SSLv3:!ADH:!aNULL:!eNULL:!NULL:!DH:!ADH:!EDH:!AESGCM': error:140E6118:SSL routines:SSL_CIPHER_PROCESS_RULESTR:invalid command
Oct 30 11:43:18 server1 dovecot: doveadm: Fatal: This is Dovecot's fatal log (1509378197)
Oct 30 12:05:07 server1 dovecot: auth: Fatal: Unknown userdb driver 'pam'
Oct 30 12:08:13 server1 dovecot: auth: Fatal: CRAM-MD5 mechanism can't be supported with given passdbs
Oct 30 12:09:09 server1 dovecot: auth: Fatal: DIGEST-MD5 mechanism can't be supported with given passdbs
I'd have to do some more investigation to see if this method is reliable enough to cover all fatal errors.
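To make the per-service idea a bit more concrete, here is a minimal sketch that wraps the tail/grep one-liner in a reusable function. The function name scan_fatal, its argument order, and the defaults (pattern "Fatal", last 100 lines) are my own assumptions for illustration, not anything Virtualmin provides. Each service would still need its own log path and pattern worked out by hand.

```shell
# Hypothetical helper (not part of Virtualmin): print lines matching a
# "fatal" pattern among the last N lines of a log file.
# Usage: scan_fatal LOGFILE [PATTERN] [LINES]
# Exit status: 0 = matches found, 1 = no match, 2 = log not readable.
scan_fatal() {
    log=$1
    pattern=${2:-Fatal}   # assumed default, taken from the Dovecot example
    lines=${3:-100}       # assumed default, same window as "tail -100"

    # Bail out early if the log is missing or unreadable.
    [ -r "$log" ] || return 2

    # Same idea as the one-liner: look only at the tail of the log.
    tail -n "$lines" "$log" | grep -E -- "$pattern"
}

# Example call for the Dovecot case shown above:
#   scan_fatal /var/log/mail.err 'dovecot: .*Fatal:'
```

A status page could then treat exit status 0 as "show a warning" and 1 as "nothing fatal in the recent log". Whether a plain pattern match is reliable enough is exactly the open question; this only shows the shape of the check.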
Showing the service up/down status is good, but it's not good enough. The Virtualmin system info page should also alert you to errors that cause critical services to fail silently. The perfect example I'm now facing is a silent failure of the Dovecot IMAP server: no user can connect over IMAP, and consequently no one can read any mailbox!!