Open anarcat opened 6 years ago
Good question. I've changed the subtitle, implying that logcheck "sucks" isn't fair.
I spent so much time tuning logcheck rules by hand, which is a really tedious process. So I started thinking about the issue.
What logcheck basically does is apply a blacklist of known-harmless messages (or rather, regexes for them) to the list of recent log messages and report all messages that aren't filtered out by the blacklist.
The hard part is tending the list of regexes, which has several problems:
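A minimal Python sketch of that filtering step might look like the following. It is not logcheck's actual implementation; the patterns and log lines are made up for illustration.

```python
import re

# Hypothetical ignore list: one regex per known-harmless message.
IGNORE_PATTERNS = [
    re.compile(r"^dovecot: imap-login: Login: user=<[-_.@\w]+>"),
    re.compile(r"^cron\[\d+\]: \(root\) CMD "),
]


def unexpected(lines):
    """Return the lines that no ignore pattern matches."""
    return [
        line for line in lines
        if not any(p.search(line) for p in IGNORE_PATTERNS)
    ]


# Invented log lines: two harmless ones and one worth reporting.
log = [
    "dovecot: imap-login: Login: user=<alice@example.org>",
    "cron[123]: (root) CMD (run-parts /etc/cron.hourly)",
    "kernel: Out of memory: Killed process 4242",
]
report = unexpected(log)
print(report)
```

Everything that survives the filter ends up in the report, so every harmless message that lacks a regex shows up again and again.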
So, my idea is to replace the logcheck config file (one regex per line) with something more suitable.
It starts in the erpel.conf config file, which can define global "fields":
https://github.com/fd0/erpel/blob/c5c77306c71b8d1307c98a8645a2e3a354d032c9/doc/erpel.conf#L13-L20
As you can see, the field IP is defined, which matches an IPv4 or IPv6 address, and has a name of IP and a template of 1.2.3.4. It also includes two sample IP addresses which the regex must match.
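To make the idea concrete, here is a rough Python model of such a field. The class, the deliberately loose regex and the two sample addresses are assumptions for illustration, not the syntax or values used in erpel.conf.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical model of a field: name, regex, template, samples."""
    name: str
    pattern: str
    template: str
    samples: list = field(default_factory=list)

    def failing_samples(self):
        """Return the samples that the field's regex does not fully match."""
        rx = re.compile(self.pattern)
        return [s for s in self.samples if rx.fullmatch(s) is None]


# Loose stand-in for "IPv4 or IPv6 address"; a real pattern would be stricter.
ip = Field(
    name="IP",
    pattern=r"[0-9a-fA-F.:]+",
    template="1.2.3.4",
    samples=["192.168.100.1", "2003::feff:1234"],
)
print(ip.failing_samples())
```

An empty list means the regex covers every sample, which is the property the samples are there to guarantee.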
Then there's a sample rule file for the dovecot IMAP/POP3 server, defining another field, mailaddress, with a template of user@domain.tld:
This field is used in the next section, which contains the rules. One rule is:
https://github.com/fd0/erpel/blob/c5c77306c71b8d1307c98a8645a2e3a354d032c9/doc/rules.d/dovecot#L43
As you can see, in contrast to logcheck the rule consists of a sample message without any regex in it, so it is readable. So, the goal here is to make the rules as simple and as readable as possible. Compare that with a typical logcheck rule for dovecot:
^\w{3} [ :[:digit:]]{11} [._[:alnum:]-]+ dovecot: (pop3|imap)-login: Login: user=<[-_.@[:alnum:]]+>, method=[[:alnum:]-]+, rip=[.:[:xdigit:]]+, lip=[.:[:xdigit:]]+(, (TLS( handshake)?|secured))?$
When erpel starts, it creates a regex for this rule by taking the literal string, escaping all special characters in it, then replacing all templates of all fields with the regexes from the field definitions. During development, the erpel show command is helpful: it displays the parsed and constructed rules, highlighting the fields it found with colors and replacing the templates with the field name:
It can also show the templates instead, highlighted with colors:
Besides the field definitions and the rules, there's a third section in an erpel rule file, which contains sample messages that must be matched by the rules. This is built so that erpel can instantly print an error message when a message which was previously ignored no longer matches the rules. The fields are also checked for consistency: each field needs to match all the samples in its definition.
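The rule construction and the sample check described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not erpel's code: the field regexes are simplified, and the rule and sample messages are invented in the style of a dovecot login line.

```python
import re

# Hypothetical field table: template -> regex (both simplified).
FIELDS = {
    "IP": {"template": "1.2.3.4", "pattern": r"[0-9a-fA-F.:]+"},
    "mailaddress": {"template": "user@domain.tld",
                    "pattern": r"[-_.@A-Za-z0-9]+"},
}


def compile_rule(rule):
    """Escape the literal rule, then swap each template for its field regex."""
    regex = re.escape(rule)
    for f in FIELDS.values():
        # The rule is already escaped, so search for the escaped template.
        regex = regex.replace(re.escape(f["template"]),
                              "(?:" + f["pattern"] + ")")
    return re.compile(regex)


def unmatched_samples(rules, samples):
    """Return the sample messages that no compiled rule fully matches."""
    compiled = [compile_rule(r) for r in rules]
    return [s for s in samples
            if not any(rx.fullmatch(s) for rx in compiled)]


rules = [
    "imap-login: Login: user=<user@domain.tld>, method=PLAIN, "
    "rip=1.2.3.4, lip=1.2.3.4, TLS",
]
samples = [
    "imap-login: Login: user=<alice@example.org>, method=PLAIN, "
    "rip=203.0.113.7, lip=2001:db8::1, TLS",
]
print(unmatched_samples(rules, samples))
```

Any sample returned by `unmatched_samples` is a message that used to be ignored and would now be reported, which is the situation the sample section is meant to catch early.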
So the workflow is something like this:
erpel show until the sample message matches
Does that sound reasonable? I'm using erpel for a few of my servers, but have not yet spent the time to polish it. Adding new messages could also be partly automated, by using the field regexes to pre-replace matching values with the templates or so.
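That last idea could be sketched in Python like this. Nothing here exists in erpel; the function name, the two simplistic regexes (IPv4 only, rough mail address) and the log line are assumptions for illustration.

```python
import re

# Hypothetical (regex, template) pairs; mail addresses are handled first.
FIELD_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "user@domain.tld"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "1.2.3.4"),
]


def suggest_rule(message):
    """Turn a raw log message into a rule candidate using field templates."""
    rule = message
    for rx, template in FIELD_PATTERNS:
        # A function replacement keeps the template text literal.
        rule = rx.sub(lambda m, t=template: t, rule)
    return rule


message = ("imap-login: Login: user=<alice@example.org>, method=PLAIN, "
           "rip=203.0.113.7, lip=198.51.100.2, TLS")
candidate = suggest_rule(message)
print(candidate)
```

The result would only be a starting point: someone still has to review the candidate before adding it to a rule file.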
Yes! That seems like a great idea, although i'm not sure of the implications of the template names... I think it's a great idea to have a kind of a unit test section for the rules...
semi-automatic rule generation when false positives come up would be a killer feature, for sure.
I encourage you to add the above as documentation in the README... maybe just a pointer here? i guess that would mean adding the images in the git repo though...
thanks for the response, very useful stuff!
Why does it suck less? I'm all for better software, but it would be useful to know how this actually differs from logcheck... performance? usability?
The biggest problem I have had with using logcheck is false negatives and the burden of crafting and deploying new rules to silence those. Can this help? :)