emersion / go-msgauth

🔏 A Go library and tools for DKIM, DMARC and Authentication-Results
MIT License

DKIM check not working for CNAME key #59

shroff closed 7 months ago

shroff commented 8 months ago

First off, thanks for the great work on this library!

Second, this issue is not directly related to go-msgauth, but I was wondering if you could suggest a workaround. Also, I'm using this library through foxcpp/maddy, and not directly.

So here goes: I use namecheap for my DNS needs, but their emails fail to be delivered with the following line in my logs: dkim: key unavailable: lookup s1._domainkey.namecheap.com. Upon closer examination, it looks like s1._domainkey.namecheap.com is not a TXT record, but a CNAME that resolves to s1.domainkey.u1828068.wl069.sendgrid.net.

What's more, the nslookup (provided by dnsutils) on my Debian Bookworm server does not follow CNAME records when looking up TXT records, while the nslookup (provided by bind) on my Arch desktop does so without any complaints.

Here's the output of both:

❯ nslookup -type=txt s1._domainkey.namecheap.com
;; Truncated, retrying in TCP mode.
Server:     1.1.1.1
Address:    1.1.1.1#53

Non-authoritative answer:
s1._domainkey.namecheap.com canonical name = s1.domainkey.u1828068.wl069.sendgrid.net.
s1.domainkey.u1828068.wl069.sendgrid.net    text = "k=rsa; t=s; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4EJ2WbK3G12fhP8hlHBTABlvdbKePJXwux+sjGXRnnoVdGAaw9q9D96qeW3uWqAbBSyPB06w4zTeK1qi7Ar+rBC91zKEiuoi6Rbd8xkDBG1Emo8RMhZjOHer5xl0TobynvYy6J4F/ge4OgA17nNDfc7n2Xg+OOKHVY4dVZfdgNR29eGraxD8X0E2pMBdNgtqKvt6S" "4irlnEuhvko+Ls3XqBicTnM30QO4ffyIJWlUqHEwVjBUHKXV+/sTif8UecWw2m9uLYlPbeNBAjMcRtmKYC+tKT39laA2mtPuQub9LHtgzkmAXqE9D7uvgc8gEoUgdvQyefKClRR/rKomB9CeQIDAQAB"

vs.

maddy@frodo:~$ nslookup -type=txt s1._domainkey.namecheap.com
;; Truncated, retrying in TCP mode.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.

It also looks like the SendGrid key doesn't begin with v=DKIM1, which the code explicitly ignores, but that's an issue for another day. I'll try reaching out to them to see if they will update their keys to conform to the RFC, but I don't have high hopes for that.
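
For reference, RFC 6376 (section 3.6.1) lists the v= tag of a key record as recommended rather than required, with a default of DKIM1, so a record that starts with k=rsa may still be acceptable to a verifier. Below is a minimal sketch of tag parsing with that default applied. parseKeyRecord and keyVersion are hypothetical helpers for illustration, not go-msgauth's actual code, and the key data in the sample record is shortened.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeyRecord splits a DKIM key record into its tag=value pairs.
// Hypothetical helper for illustration; not taken from go-msgauth.
func parseKeyRecord(txt string) map[string]string {
	tags := make(map[string]string)
	for _, part := range strings.Split(txt, ";") {
		k, v, ok := strings.Cut(part, "=")
		if !ok {
			continue // skip empty or malformed segments
		}
		tags[strings.TrimSpace(k)] = strings.TrimSpace(v)
	}
	return tags
}

// keyVersion returns the record's version, falling back to the
// RFC 6376 default of "DKIM1" when the v= tag is absent.
func keyVersion(tags map[string]string) string {
	if v, ok := tags["v"]; ok {
		return v
	}
	return "DKIM1"
}

func main() {
	// Shortened stand-in for the SendGrid record; the key data is elided.
	tags := parseKeyRecord("k=rsa; t=s; p=MIIB...")
	fmt.Println(keyVersion(tags), tags["k"], tags["t"])
}
```

Only the first "=" in each segment is treated as the separator, so base64 padding inside the p= value is left intact.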

emersion commented 8 months ago

Hm, I don't see anywhere in the RFC where CNAME lookups are mentioned.

shroff commented 8 months ago

True, I was surprised when I saw that too, but it looks like at least MailChimp and SendGrid give instructions for setting up DKIM using CNAME.

And it looks like there is no consensus between the different DNS resolver implementations (libdns, libbind9, or whatever else is causing the difference between the two setups).

The question is - how do you weigh sticking to the spec vs what is happening in practice?

AGWA commented 8 months ago

The problem here is not the CNAME - nslookup -type=txt s1._domainkey.namecheap.com on my Debian Bookworm system returns the TXT record, as does net.LookupTXT("s1._domainkey.namecheap.com"). The DKIM RFC does not need to mention CNAMEs, because RFC 1034 already says that a TXT lookup will follow CNAMEs.
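
A minimal sketch of that check using only the standard library's net.Resolver: it asks for the CNAME of the selector name, then issues a TXT query on the same name and picks out the key record. findKeyRecord is a hypothetical helper and the 5-second timeout is an arbitrary choice. What actually comes back depends on the resolver the machine is configured to use.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// findKeyRecord returns the first TXT value that carries a p= tag.
// Hypothetical helper; a real verifier parses the tags properly.
func findKeyRecord(txts []string) (string, bool) {
	for _, txt := range txts {
		for _, part := range strings.Split(txt, ";") {
			if strings.HasPrefix(strings.TrimSpace(part), "p=") {
				return txt, true
			}
		}
	}
	return "", false
}

func main() {
	const name = "s1._domainkey.namecheap.com"

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Where the alias points, if there is one.
	if cname, err := net.DefaultResolver.LookupCNAME(ctx, name); err == nil {
		fmt.Println("canonical name:", cname)
	}

	// A TXT query on the alias; the resolver is expected to follow the CNAME.
	txts, err := net.DefaultResolver.LookupTXT(ctx, name)
	if err != nil {
		fmt.Println("TXT lookup failed:", err)
		return
	}
	if rec, ok := findKeyRecord(txts); ok {
		fmt.Println("key record:", rec)
	}
}
```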

The problem is that your server is having some trouble contacting namecheap.com's DNS servers:

;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.
;; Connection to xxx#53(xxx) for s1._domainkey.namecheap.com failed: timed out.

shroff commented 8 months ago

That's not the issue

maddy@frodo:~$ nslookup -type=cname s1._domainkey.namecheap.com
Server:     xxx
Address:    xxx#53

Non-authoritative answer:
s1._domainkey.namecheap.com canonical name = s1.domainkey.u1828068.wl069.sendgrid.net.

Authoritative answers can be found from:
namecheap.com   nameserver = edns1.registrar-servers.com.
namecheap.com   nameserver = edns2.registrar-servers.com.
namecheap.com   nameserver = edns4.ultradns.biz.
namecheap.com   nameserver = edns4.ultradns.com.
namecheap.com   nameserver = edns4.ultradns.net.
namecheap.com   nameserver = edns4.ultradns.org.

emersion commented 8 months ago

What is the exact error returned by go-msgauth?

AGWA commented 8 months ago

Then the problem is likely contacting sendgrid.net's servers. Try nslookup -type=txt s1.domainkey.u1828068.wl069.sendgrid.net

shroff commented 8 months ago

key unavailable

shroff commented 8 months ago

> Then the problem is likely contacting sendgrid.net's servers. Try nslookup -type=txt s1.domainkey.u1828068.wl069.sendgrid.net

Already tried that

maddy@frodo:~$ nslookup -type=txt s1.domainkey.u1828068.wl069.sendgrid.net.
Server:     xxx
Address:    xxx#53

Non-authoritative answer:
s1.domainkey.u1828068.wl069.sendgrid.net    text = "k=rsa; t=s; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4EJ2WbK3G12fhP8hlHBTABlvdbKePJXwux+sjGXRnnoVdGAaw9q9D96qeW3uWqAbBSyPB06w4zTeK1qi7Ar+rBC91zKEiuoi6Rbd8xkDBG1Emo8RMhZjOHer5xl0TobynvYy6J4F/ge4OgA17nNDfc7n2Xg+OOKHVY4dVZfdgNR29eGraxD8X0E2pMBdNgtqKvt6S" "4irlnEuhvko+Ls3XqBicTnM30QO4ffyIJWlUqHEwVjBUHKXV+/sTif8UecWw2m9uLYlPbeNBAjMcRtmKYC+tKT39laA2mtPuQub9LHtgzkmAXqE9D7uvgc8gEoUgdvQyefKClRR/rKomB9CeQIDAQAB"

AGWA commented 8 months ago

Well, you're clearly having trouble contacting something as indicated by the "timed out" errors from nslookup, but it's hard to know because you're redacting the error messages. You still haven't provided the full error message from go-msgauth. It should look something like:

dkim: key unavailable: lookup s1._domainkey.namecheap.com on 169.254.169.254:53: dial udp 169.254.169.254:53: connect: no route to host

shroff commented 8 months ago

key unavailable: lookup s1._domainkey.namecheap.com on [2a01:4ff:ff00::add:2]:53: read udp [fd00::249f:bfff:fe7a:72e6]:48573->[2a01:4ff:ff00::add:2]:53: i/o timeout
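
An error like this one can also be inspected in code rather than read from the log. The sketch below unwraps to the standard library's *net.DNSError and reports whether the failure was a timeout and which server was involved. It assumes the wrapping error can be unwrapped with errors.As, which is an assumption about how the DKIM error is built and is not confirmed in this thread. classify and syntheticTimeout are hypothetical names, and the sample error is constructed by hand so the example runs offline.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// classify gives a rough label to a lookup error by unwrapping it
// to the standard library's *net.DNSError.
func classify(err error) string {
	if err == nil {
		return "ok"
	}
	var dnsErr *net.DNSError
	if !errors.As(err, &dnsErr) {
		return "not a DNS error"
	}
	switch {
	case dnsErr.IsTimeout:
		return "timeout talking to " + dnsErr.Server
	case dnsErr.IsNotFound:
		return "no such record"
	default:
		return "other DNS failure"
	}
}

// syntheticTimeout builds an error shaped like the one quoted above,
// so the example runs without touching the network.
func syntheticTimeout() error {
	return fmt.Errorf("key unavailable: %w", &net.DNSError{
		Err:       "i/o timeout",
		Name:      "s1._domainkey.namecheap.com",
		Server:    "[2a01:4ff:ff00::add:2]:53",
		IsTimeout: true,
	})
}

func main() {
	fmt.Println(classify(syntheticTimeout()))
}
```

A timeout points at the path to the resolver (routing, firewall, the resolver itself), whereas a not-found result would point at the record.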

Already tried the following:

maddy@frodo:~$ nslookup -type=cname s1._domainkey.namecheap.com 2a01:4ff:ff00::add:2
Server:     2a01:4ff:ff00::add:2
Address:    2a01:4ff:ff00::add:2#53

Non-authoritative answer:
s1._domainkey.namecheap.com canonical name = s1.domainkey.u1828068.wl069.sendgrid.net.

Authoritative answers can be found from:
namecheap.com   nameserver = edns4.ultradns.biz.
namecheap.com   nameserver = edns4.ultradns.com.
namecheap.com   nameserver = edns4.ultradns.net.
namecheap.com   nameserver = edns4.ultradns.org.
namecheap.com   nameserver = edns1.registrar-servers.com.
namecheap.com   nameserver = edns2.registrar-servers.com.

maddy@frodo:~$ nslookup -type=txt s1._domainkey.namecheap.com 2a01:4ff:ff00::add:2
;; communications error to 2a01:4ff:ff00::add:2#53: timed out
;; communications error to 2a01:4ff:ff00::add:2#53: timed out
;; communications error to 2a01:4ff:ff00::add:2#53: timed out
;; no servers could be reached

emersion commented 8 months ago

This sounds like an issue with the DNS server, not with go-msgauth. The following program fails on the Go playground, but succeeds locally for me: https://go.dev/play/p/MGIIpQS_oix

shroff commented 7 months ago

I tried a similar example locally and on the server and got the same result as you - local success and remote failure. Looks like the problem is with the local resolver and not the DNS server, since using 1.1.1.1 as the nslookup server also fails.

You're right that this issue isn't related to go-msgauth. I mentioned this in the issue description, but I was hoping to try to figure out what exactly is going on, because this is quite a strange and unexpected issue.

Anyway, thanks for your help, and for yours @AGWA. I'll try some more things and report if I have any success.

shroff commented 7 months ago

Okay, got it.

;; Truncated, retrying in TCP mode.

My nftables config is pretty conservative, and has a whitelist of outbound TCP ports which did not include 53.
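
For anyone hitting the same symptom: the DKIM TXT answer is too large for a plain UDP reply, so it comes back truncated and the resolver retries over TCP, which is the step a firewall without outbound TCP port 53 blocks. One way to test that path in isolation is a resolver that skips UDP altogether. This is a sketch, not a definitive diagnostic. tcpOnlyResolver is a hypothetical helper, 1.1.1.1:53 is simply the public resolver used earlier in this thread, and the timeouts are arbitrary.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// tcpOnlyResolver returns a resolver that sends every query to server
// over TCP, bypassing UDP entirely.
func tcpOnlyResolver(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // the Dial hook is only used by Go's built-in resolver
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			d := net.Dialer{Timeout: 3 * time.Second}
			// Ignore the requested network and address; always dial TCP.
			return d.DialContext(ctx, "tcp", server)
		},
	}
}

func main() {
	const server = "1.1.1.1:53"

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	txts, err := tcpOnlyResolver(server).LookupTXT(ctx, "s1._domainkey.namecheap.com")
	if err != nil {
		fmt.Println("TCP lookup failed (is outbound tcp/53 allowed?):", err)
		return
	}
	fmt.Println("TCP lookup returned", len(txts), "TXT record(s)")
}
```

If this fails while small UDP-only queries such as the CNAME lookup succeed, the firewall rules for outbound TCP port 53 are a likely place to look.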