ScottPeterJohnson / purelymail-issues

Issues repository for the Purelymail email service.

SpamAssassin: False Negatives due to RCVD_IN_DNSWL_HI #78

Open ScottPeterJohnson opened 2 years ago

ScottPeterJohnson commented 2 years ago

(This issue was imported from Gitea) stephan on August 10, 2021: Unfortunately, spam increasingly makes it through the SpamAssassin filters and ends up in my inbox.

I think I might have found one possible reason.

Many of the spam mails contain RCVD_IN_DNSWL_HI in X-Spam-Status. As I understand it, RCVD_IN_DNSWL_HI comes from the https://www.dnswl.org/ whitelist, which should mark known "good" senders and in turn give negative (= less spammy) points in SpamAssassin. (The default would probably be -5.0, but I don't know what's configured for Purelymail.)

However, dnswl.org will apparently give out false information ("this is a good sender", even if it's not) to some very high volume users to get them to notice and stop querying above the limits.

See: https://www.mail-archive.com/users@spamassassin.apache.org/msg108274.html

(Also: https://www.linode.com/community/questions/21413/rcvd_in_dnswl_hi-false-positives, https://wmbuck.net/blog/?p=1191)

I think Purelymail is using the Cloudflare DNS resolvers. As those are used by millions of people, it seems plausible that they (or at least some of their IPs) are "blocked" by dnswl.org.

A few example IPs (from the X-Pm-Spam-Purelymail-Ip-Reputation header) that have been marked with RCVD_IN_DNSWL_HI but are not actually whitelisted according to https://www.dnswl.org/?page_id=72:

91.103.252.72 193.162.143.181 91.103.252.72 5.199.130.206 91.103.252.174
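For anyone who wants to check their own spam for the same pattern, here is a minimal sketch that pulls the rule names out of the X-Spam-Status header of a raw message. It assumes the usual SpamAssassin layout (`tests=RULE_A,RULE_B,...`, possibly folded across lines); the function name `spam_tests` is made up for illustration, and Purelymail's actual header may differ in detail.

```python
import re
from email import message_from_string


def spam_tests(raw_message: str) -> set:
    """Return the rule names listed in tests=... of X-Spam-Status.

    Assumes the common SpamAssassin header layout; returns an empty
    set if the header or the tests= field is missing.
    """
    msg = message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    # Folded headers break the rule list after a comma; rejoin it.
    status = re.sub(r",\s+", ",", status)
    for token in status.split():
        if token.startswith("tests="):
            names = token[len("tests="):].split(",")
            return {name for name in names if name}
    return set()


def hit_dnswl_hi(raw_message: str) -> bool:
    """True if the message was scored with RCVD_IN_DNSWL_HI."""
    return "RCVD_IN_DNSWL_HI" in spam_tests(raw_message)
```

Running `hit_dnswl_hi` over the raw source of each spam mail gives a quick count of how many of them received the whitelist bonus.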

ScottPeterJohnson commented 2 years ago

Comment by Scott on August 10, 2021: I've noticed the DNSWL_HI problem, and have actually been investigating it. As far as I know, DNSWL doesn't provide false information anymore, nor do we yet qualify as a high-volume sender. Our DNS isn't Cloudflare; it's the default AmazonDNS.

My best theory at the moment is that, if it's not an issue with DNSWL itself, it might be a DNS cache poisoning attack, since both DNS and DNSWL seem fundamentally vulnerable to it.

I might end up just disabling DNSWL rules if so.

ScottPeterJohnson commented 2 years ago

Comment by Scott on August 11, 2021: Uh, apparently I'm wrong, according to this recent mailing list entry (whose site is inconveniently down but is in Google's cache), and they do still send false information?

That's legitimately fucking awful of them if true. Like, they have the URIBL_BLOCKED return code for just this situation, and they don't appear to give it to us. I'm disabling the rule, so let's find out.

ScottPeterJohnson commented 2 years ago

Comment by stephan on August 11, 2021: Hi Scott,

thanks a lot for looking into this so quickly.

Yeah, I think they're still doing this :-(

When checking some unlisted IPs (e.g. 91.103.252.72 -> 72.252.103.91.list.dnswl.org) on DNS propagation test sites, the lookups sometimes return codes like 127.0.10.3 (category: "some special cases", haha) instead of 127.0.0.255 (above limit).

That appears to be happening to some of the larger resolvers (e.g. resolver1.opendns.com). Seems quite random though. Lots of timeouts and SERVFAILs as well...
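To make the lookup above concrete, here is a rough sketch of how the query name is built and how an answer can be read. It assumes the 127.0.&lt;category&gt;.&lt;trust&gt; layout with trust 0 to 3 (none, low, medium, high), which is my understanding of the published dnswl.org return codes, and 127.0.0.255 as the "above limit" answer mentioned above. The function names are invented for illustration, and no DNS query is actually sent.

```python
import ipaddress

DNSWL_ZONE = "list.dnswl.org"
# Assumed mapping of the last octet to a trust level.
TRUST_LEVELS = {0: "none", 1: "low", 2: "medium", 3: "high"}


def dnswl_query_name(ip: str) -> str:
    """Build the DNSWL lookup name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + DNSWL_ZONE


def decode_dnswl_answer(answer: str) -> dict:
    """Split a 127.0.x.y answer into category and trust level."""
    a, b, category, trust = (int(part) for part in answer.split("."))
    if (a, b) != (127, 0):
        raise ValueError("not a DNSWL-style answer: " + answer)
    if (category, trust) == (0, 255):
        # The resolver is over the query limit; no listing info at all.
        return {"blocked": True, "category": None, "trust": None}
    return {
        "blocked": False,
        "category": category,
        "trust": TRUST_LEVELS.get(trust, "unknown"),
    }
```

With this, `dnswl_query_name("91.103.252.72")` gives the name from the example, and decoding 127.0.10.3 reports category 10 with trust "high", which is the level that RCVD_IN_DNSWL_HI matches on.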

Thank you very much for disabling the rule!

Ah, just to explain why I thought Purelymail was using the Cloudflare resolvers: to test, I sent some mails to bounce@1.[random-number].bash.ws to trigger an MX record lookup from Purelymail. Then I used this API, https://bash.ws/dnsleak/test/[random-number]?txt, to check where the DNS lookups came from. (Example: https://bash.ws/dnsleak/test/1234567?txt) That gives only Cloudflare ASN IPs. But of course the SMTP delivery DNS can be - and apparently is ;-) - different from what SpamAssassin is using.
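For reference, the two strings used in that test can be generated like this. It is only a sketch: the helper name `dns_leak_probe` and the seven-digit ID range are assumptions (the example above uses 1234567), and the code neither sends mail nor fetches the result page.

```python
import random


def dns_leak_probe(test_id=None) -> dict:
    """Return the probe address and result URL for one test ID.

    The ID range is an assumption; any number matching the
    [random-number] placeholder above should work the same way.
    """
    if test_id is None:
        test_id = random.randint(1000000, 9999999)
    return {
        # Mailing this address makes the sending server look up MX records.
        "address": f"bounce@1.{test_id}.bash.ws",
        # The lookups recorded for that ID can then be read here.
        "results_url": f"https://bash.ws/dnsleak/test/{test_id}?txt",
    }
```

Send a mail to the `address` value from the account under test, wait for the delivery attempt, then open `results_url` to see which resolvers made the lookups.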