sdgathman / pyspf

Other
49 stars 26 forks source link

Add special case for +all SPF ? #18

Closed gpatel-fr closed 2 years ago

gpatel-fr commented 4 years ago

Greetings

This is a suggestion/question, not a bug report.

I just wonder if pyspf could not add some special (optional) handling for SPF records like this:

v=spf1 +a +mx +a:vpr60.prodius.be +all

Basically it would just ignore them.

this is the current behaviour,

(mypy3) gerard@q1900:~/pyspf$ python spf.py "v=spf1 +a +mx +a:vpr60.prodius.be +all" 70.184.247.44 billsegat@microsoft.com mail.gathman.org result: ('pass', 250, 'sender SPF authorized') +all

A formally correct result, but a bit of a let down in the spirit of the standard (it's supposed to fight spam, right ? this SPF record is one among many other like that I have seen in my logs)

So the changed output could be - with an additional command switch (corresponding to a class parameter / parameter to check/check2 global functions):

(mypy3) gerard@q1900:~/pyspf$ python spf.py -k "v=spf1 +a +mx +a:vpr60.prodius.be +all" 70.184.247.44 billsegat@microsoft.com mail.gathman.org result: ('fail', 550, 'SPF fail - not authorized') -all](url)

I could do a PR along these lines, but before doing it I prefer to know if it would be rejected out of hand as not standard conformant.

The default behaviour would stay of course standard conformant.

I'm well aware that this could be handled at the mail server level, but it would be easier to do it at the library level.

sdgathman commented 4 years ago

First, there is no problem with options to extend standard conforming behavior. The library already has support for heuristics to e.g. interpret ip:1.2.3.4 as ip4:1.2.3.4. However, all these extensions return two results: the official standard conforming result (PermErr in the case of ip:1.2.3.4) and the heuristic result (called "best guess"). So in your case, the standard result would still be Pass, but you could return a heuristic result of:

or whatever.

The next issue is that simply checking for +all is no good. You can have legit policies that end in +all, and exclude most of the ip space with '-' policies. (And I've seen them.) Now your example implies a heuristic of checking that all mechanisms are '+' - I think that is correct. Ideally, you want to compute an ipset, and check that less that some percentage of ip space is accepted - but that might be too complex. (The libspf2 C library can actually "compile" policies into a byte code that can be cached and evaluated faster - especially by eliminating DNS lookups. The cached bytecode has an expiration based on DNS lifetimes.)

We do already import ipaddress, which includes ipset - but computing the ipset requires evaluating every mechanism. I think. That is certainly faster that evaluation for every possible IP :-), but maybe there is a tricky algorithm I'm not thinking of.

Note that the CLI has an undocumented option to compute an iplist: substitute "list" for the ip address. This implementation is incomplete, I added it as a tool to help debug simple policies. In particular, it doesn't handle the effect of '-' mechanisms. This is tricky because of the short circuiting. It also doesn't handle the all mechanism (since simple policies just end in -all). One project would be to make the iplist implementation more complete, and see if it could be reasonably used to compute an ip space coverage. But, many policies with macros can't be reduced to an ipset without evaluating for every ip. E.g. looking up a DNS record with a name constructed from the IP. So you would still have to recognize such policies and punt (which is what libspf2 does).

So the "all + mechanisms" heuristic is probably simple and effective enough for the present.

sdgathman commented 4 years ago

it's supposed to fight spam, right?

Actually, no. SPF is supposed to prevent MAIL FROM and HELO forgery. That is why US banks adopted it. Spammers generally have perfectly correct SPF records - for a throwaway domain. So when you see a stupid record like you describe - it is much more likely a clueless mail admin than a spammer. So having a PermErr heuristic and maybe sending a DSN saying "your Sender Policy is totally useless - did you make a mistake?" is probably the proper response.

gpatel-fr commented 4 years ago

(...)

The next issue is that simply checking for +all is no good. (... skip explanation on how horribly difficult would be to implement correctly IP address space calculation) So the "all + mechanisms" heuristic is probably simple and effective enough for the present.

Thanks for taking the time to reply ... but sorry but I have a bit of trouble parsing your answer. Checking for +all is simple indeed, that's a project like rspamd is doing for example, with utter brutality (and I wonder if people deliberately pissing in the pool deserve more to be candid); is something like that acceptable or not ? As I'm sure you are aware, hijacking sites with dumb policies located in well managed registries is an effective strategy for spammers; I received a spam one week ago pretending to be from a 14 year old domain in my country (and certainly not coming really from it) having the marvellous policy of v=spf1 +all, and the domain is still not found in dbl.spamhaus.org, one week of spamming (at least) is already very good for a spammer.

sdgathman commented 4 years ago

Checking for a policy ending in +all is not acceptable. Fails can happen with earlier '-' mechanisms in a legit policy. What I think would work is checking for all + mechanisms ending in +all (the default result is Neutral). E.g., "v=spf1 -include:spf_reject.example.com +all" cannot be rejected by your heuristic. But on the other hand "v=spf1 +include:spf_reject.example.com +all" will certainly pass all IPs.

sdgathman commented 4 years ago

Here is another example. "v=spf1 a -ip6:2001:://3 -ip4:0.0.0.0/0 +all" mail domain IP gets a pass, and only private IP6 ips get a pass (i.e. internal sources). Just driving home that +all at the end does NOT by itself indicate a null policy.

kitterma commented 4 years ago

There are so many ways to write an SPF record that will pass all mail, I don't think there's any value in adding special processing for any one specific record. The best case is it annoys enough people they write a slightly different one and then you get to start over.

Also, not all domains with v=spf1 +all are abuse sources. It's also used by people who have been told they "have" to do SPF, but don't want to deal with it. If a domain wants to accept responsibility for mail from the domain sent from anywhere on the Internet, then I think they should get to do that.

gpatel-fr commented 4 years ago

Checking for a policy ending in +all is not acceptable. Fails can happen with earlier '-' mechanisms in a legit policy. What I think would work is checking for all + mechanisms ending in +all (the default result is Neutral). E.g., "v=spf1 -include:spf_reject.example.com +all" cannot be rejected by your heuristic. But on the other hand "v=spf1 +include:spf_reject.example.com +all" will certainly pass all IPs.

Right, I see your point. I will try to understand better the implementation before thinking on a good compromise.

gpatel-fr commented 4 years ago

There are so many ways to write an SPF record that will pass all mail, I don't think there's any value in adding special processing for any one specific record. The best case is it annoys enough people they write a slightly different one and then you get to start over.

the goal is not to fight people genuinely wanting to help spammers, a blacklist signaling is the best way for that (or even complaining to their ISP). The problem is lazy admins. If it's less work to write a correct SPF record they will probably do it.

Also, not all domains with v=spf1 +all are abuse sources. It's also used by people who have been told they "have" to do SPF, but don't want to deal with it. If a domain wants to accept responsibility for mail from the domain sent from anywhere on the Internet, then I think they should get to do that.

it's exactly as acceptable as setting up an open relay.

Daniel-Abrecht commented 2 years ago

I've opened a pull request to just treat +all like ?all: https://github.com/sdgathman/pyspf/pull/35 My rational for this is in the pull request & commit message.

kitterma commented 2 years ago

I think it's up to the SPF library to do SPF as defined by RFC (4408|7208). Any exceptions to that should be done in a way that doesn't automatically change results to a non-standard result without some kind of opt-in. Even if something like this were a good idea (I still don't), this is definitely not the way to do it.

Daniel-Abrecht commented 2 years ago

So to get this, I should propose an update to those RFCs or make a new RFC first, and hope it gets accepted?

sdgathman commented 2 years ago

That would be like mandating logical positivism. (Only constructive proofs allowed.) If you want to treat policies with +all specially, note that pyspf already provides the mechanism that matched.

   if q.mechanism == 'all' and q.result == 'pass':
      q.best_guess = 'neutral'   # set a non-RFC result
Daniel-Abrecht commented 2 years ago

Yes. SPF is there to specify who is authorized to send a mail. For an unknown sender, by definition, you can not know this. And I don't want to have to deal with such broken setups spamming my inbox.

Daniel-Abrecht commented 2 years ago

Currently, I just patch pyspf on my server locally to do what I want. I use postfix-policyd-spf-python on my mail server, which uses pysqf. I'm not aware of any way to configure it to treat +all specially, and the code is a bit convoluted. Patching pyspf is just way easier, and more people can benefit from it.

sdgathman commented 2 years ago

I would make a lot more sense to patch postfix-policyd-spf-python, with something like the example code. The bestguess property is a non-standard result for use by local policy. You should NEVER change Received-SPF or AuthenticationResults to report a non-standard result as the standard result, except by an extension like bestguess=or localres=. Also, by patching pyspf to be non-conforming, you could break other apps on the same box. (But in a container, it might not matter.)

sdgathman commented 2 years ago

Note that I have yet to encounter spam using +all (I understand the concern is about potential forgery, not spam). All spam is either Windows zombie stuff with invalid HELO, or has perfectly formed SPF with a throw-away domain.

sdgathman commented 2 years ago

Yes. SPF is there to specify who is authorized to send a mail. For an unknown sender, by definition, you can not know this. And I don't want to have to deal with such broken setups spamming my inbox.

This is not true. IP is not the only method of authorization.