Closed: flimzy closed this issue 10 years ago.
Our current implementation uses two database tables. One table is static (it could just as well be defined statically in Perl or in a config file) and defines how filtering results should adjust the banner delay up or down. For example, if we accept a message from a sending server as clean, we decrease the banner delay by 3 seconds; if we reject a message as spam, we increase the delay by 5 seconds; if we reject it due to a non-existent recipient address, we increase the delay by 1 second. The second table simply tracks sending-server IP addresses and the banner delay each should currently receive; it could be stored in memory or with Cache::FastMmap.
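For illustration, here is a minimal sketch (in Perl, with hypothetical names and purely in-memory storage) of what those two lookups amount to; the real implementation uses database tables as described above:

```perl
# Hypothetical sketch of the two tables described above.
# Table 1 (static): how each filtering result adjusts the banner delay.
my %delay_adjustment = (
    clean         => -3,   # accepted as clean: shorten the delay
    spam_reject   =>  5,   # rejected as spam: lengthen the delay
    bad_recipient =>  1,   # rejected for a non-existent recipient
);

# Table 2 (dynamic): current banner delay per sending-server IP.
# A plain hash here; it could equally be Cache::FastMmap or a DB table.
my %banner_delay;
my $default_delay = 15;

sub adjust_delay {
    my ($ip, $result) = @_;
    my $delay = $banner_delay{$ip} // $default_delay;
    $delay += $delay_adjustment{$result} // 0;
    $delay = 0 if $delay < 0;
    $banner_delay{$ip} = $delay;
    return $delay;
}
```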
Note that the biggest advantages we've seen from this methodology have nothing to do with actually rejecting "early talkers", but more to do with rate limiting. We set a per-IP and per-class-C concurrency limit of 1 for any server that has a banner delay of 15 seconds or higher. The default is 15 seconds, so in other words you have to send some legitimate mail through us before we will allow you or your /24 any concurrency whatsoever. This method results in a large number of 'max concurrency' deferrals with very few false positives (and when false positives do happen, they are short-lived enough that we never hear about them).
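A rough sketch of that concurrency rule, building on the hypothetical %banner_delay hash above (not the actual plugin code):

```perl
# Any server whose banner delay is at or above the 15-second default is
# limited to one concurrent connection, both per IP and per /24 (class C).
sub concurrency_limit_for {
    my ($ip) = @_;
    my $delay = $banner_delay{$ip} // $default_delay;
    return 1 if $delay >= $default_delay;   # new/penalized senders: no concurrency
    return 0;                               # 0 meaning "no special limit"
}
```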
A lot of rate limiting also occurs because spammer mail clients give up on the banner delay and disconnect before we ever get around to sending the greeting. This happens a ton, and when you track it you can see that as we continue to increase the delay, more and more illegitimate clients stop attempting to send through QP, especially as we hit round-number milestones, e.g. 20, 30, 45, or 60 seconds.
I do like this idea, and I do something very similar within the karma plugin. I implemented karma's reputation database in AnyDBM, but a better choice would be a key-value store such as Redis or MongoDB, preferably Redis. The reputation database is a "hot" lookup that gets hit early on every connection. The last time I tested Redis, I was hitting 90,000 qps.
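As a sketch of what a Redis-backed lookup might look like from Perl (using the Redis CPAN client; the "reputation:&lt;ip&gt;" key layout is an assumption for illustration, not karma's actual schema):

```perl
use Redis;

my $redis = Redis->new( server => '127.0.0.1:6379' );

# Hypothetical key layout: one integer score per IP under "reputation:<ip>".
sub get_reputation {
    my ($ip) = @_;
    return $redis->get("reputation:$ip") // 0;   # unknown IPs start neutral
}

sub adjust_reputation {
    my ($ip, $delta) = @_;
    $redis->incrby( "reputation:$ip", $delta );
    $redis->expire( "reputation:$ip", 7 * 24 * 3600 );   # let stale entries age out
}
```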
PS: If your reputation information were moved into a Redis store, I would almost certainly update the karma plugin to share it.
PPS: If the Redis schema were the same as the one used by my karma plugin in Haraka, that'd be even more delicious.
For now, the reputation database and the setting of the connection note to control the wait length are ruled "out of scope" for this issue.
How is reputation determined?