trusteddomainproject / OpenDKIM

Other
97 stars 52 forks source link

OpenDKIM handles Authentication-Results parsing badly (no CFWS handling), and thus ignores valid auth headers. #186

Open MrPeteH opened 1 year ago

MrPeteH commented 1 year ago

(NOTE: this is similar to OpenARC issue # 163 )

Various RFC-compliant DKIM and ARC softwares produce headers that OpenDKIM does not understand. I've tracked down the issue to lack of RFC compliance in OpenDKIM header parsing. This issue report provides a few sample headers that OpenDKIM doesn't like, then describes the challenge in some specificity.

SUMMARY OF TWO KEY PARSE ISSUES 1) OpenDKIM doesn't parse CFWS whitespace properly (Folding White Space and Comments.) This ABNF is ubiquitous in key headers. 2) OpenDKIM doesn't parse AuthServe-ID as dot-atom - appears to assume it is FQDN?

SAMPLES CAUSING TROUBLE (ALL are RFC-compliant headers) 1) Produced by OpenDKIM on my own server. (Problem: the comment is standard CFWS and breaks parsing)

    dkim=pass (1024-bit key; unprotected) header.d=their.dom.ain header.i=@their.dom.ain header.a=rsa-sha256 header.s=1000073432 header.b=eKmreZ4p;
    dkim-atps=neutral

2) Produced by DKIM on a popular hosting service (not sure what SW they use) (Problem: the AuthServe-ID is dot-atom)

    rspamd-786cb55f77-65p7t;
    auth=pass smtp.auth=sample-host smtp.mailfrom=user@dom.ain

etc.

DISCUSSION OF THE ISSUE AFAIK (from experience -- see above -- and code examination), OpenDKIM parsing is not RFC-compatible with CFWS. What's that?

Authentication-Results header definition ABNF make many references to RFC 5322 CFWS -- Folding White Space and Comments -- which is essentially folded white space plus any amount of [ \t\n] white space plus optional, nestable, () comments which can contain any text at all other than "(", ")" or "\".

Here are ABNF references in the RFC's:

As a dev, that sounds complex to me. I searched for some help towards a functioning implementation. To get us started, here is a regexp definition for CFWS, extracted from http://www.watersprings.org/pub/id/draft-seantek-mail-regexen-01.html#rfc.section.3.2

(?(DEFINE)
 (?<FWS>(?:[\t ]*\r\n)?[\t ]+)
 (?<CFWS>(?:(?&FWS)?(?&comment))+(?&FWS)?|(?&FWS))
 (?<ctext>[!-'*-\[\]-~])
 (?<ccontent>(?&ctext)|(?&quoted_pair)|(?&comment))
 (?<comment>\((?:(?&FWS)?(?&ccontent))*(?&FWS)?\))
 (?<quoted_pair>\\[ -~])
)

NOTE: The equivalent issue is found in OpenARC. Solve one and we can solve them both.

futatuki commented 9 months ago

It seems that the regular expression(PCRE) above is not enough if we want to parse CFWS as RFC5322 says, because

futatuki commented 8 months ago

As far as I read opendkim/opendkim-ar.c, it seems it can handle comments correctly as most outside comment token and dropping white spaces in ares_tokenize() , and then ares_parse() skips comment tokens.

Actually I built opendkim-ar test program from develop branch after ./configure , by do cd opendkim; cc -I.. -I../libopendkim -O0 -g -DARTEST=1 -pipe -o opendkim-ar opendkim-ar.c, then:

$ # sample 1 in above.
$ ./opendkim-ar 'foo ; dkim=pass (1024-bit key; unprotected) header.d=their.dom.ain header.i=@their.dom.ain header.a=rsa-sha256 header.s=1000073432 header.b=eKmreZ4p;
 dkim-atps=neutral'
token 0 = 'foo'
token 1 = ';'
token 2 = 'dkim'
token 3 = '='
token 4 = 'pass'
token 5 = '(1024-bit key; unprotected)'
token 6 = 'header'
token 7 = '.'
token 8 = 'd'
token 9 = '='
token 10 = 'their'
token 11 = '.'
token 12 = 'dom'
token 13 = '.'
token 14 = 'ain'
token 15 = 'header'
token 16 = '.'
token 17 = 'i'
token 18 = '='
token 19 = '@their'
token 20 = '.'
token 21 = 'dom'
token 22 = '.'
token 23 = 'ain'
token 24 = 'header'
token 25 = '.'
token 26 = 'a'
token 27 = '='
token 28 = 'rsa-sha256'
token 29 = 'header'
token 30 = '.'
token 31 = 's'
token 32 = '='
token 33 = '1000073432'
token 34 = 'header'
token 35 = '.'
token 36 = 'b'
token 37 = '='
token 38 = 'eKmreZ4p'
token 39 = ';'
token 40 = 'dkim-atps'
token 41 = '='
token 42 = 'neutral'

2 results found
authserv-id 'foo'
version ''
result #0, 5 properties
        method "dkim"
        result "pass"
        reason ""
        property #0
                ptype "header"
                property "d"
                value "their.dom.ain"
        property #1
                ptype "header"
                property "i"
                value "@their.dom.ain"
        property #2
                ptype "header"
                property "a"
                value "rsa-sha256"
        property #3
                ptype "header"
                property "s"
                value "1000073432"
        property #4
                ptype "header"
                property "b"
                value "eKmreZ4p"
result #1, 0 properties
        method "dkim-atps"
        result "neutral"
        reason ""
$ # sample 2 in above
$ ./opendkim-ar 'rspamd-786cb55f77-65p7t;
>  auth=pass smtp.auth=sample-host smtp.mailfrom=user@dom.ain'
token 0 = 'rspamd-786cb55f77-65p7t'
token 1 = ';'
token 2 = 'auth'
token 3 = '='
token 4 = 'pass'
token 5 = 'smtp'
token 6 = '.'
token 7 = 'auth'
token 8 = '='
token 9 = 'sample-host'
token 10 = 'smtp'
token 11 = '.'
token 12 = 'mailfrom'
token 13 = '='
token 14 = 'user@dom'
token 15 = '.'
token 16 = 'ain'
1 result found
authserv-id 'rspamd-786cb55f77-65p7t'
version ''
result #0, 2 properties
        method "auth"
        result "pass"
        reason ""
        property #0
                ptype "smtp"
                property "auth"
                value "sample-host"
        property #1
                ptype "smtp"
                property "mailfrom"
                value "user@dom.ain"

Both of samples could be parsed correctly.

As there is no difference in ares_parse() among OpenDKIM, OpenDMARC, and OpenARC, there is also no issue on parsing CFWS in AR header on OpenDMARC and OpenARC, I think.

MrPeteH commented 8 months ago

@futatuki good catch on the OBS syntax! And thank you for your work to verify. I would be happy to invest some rare round-tuit time to duplicate your effort, and apply it to the failing examples I've seen.