mjl- / mox

modern full-featured open source secure mail server for low-maintenance self-hosted email
https://www.xmox.nl
MIT License

SPF checks fail for mails sent between users managed by mox when mox is behind NAT #159

Open lmeunier opened 2 months ago

lmeunier commented 2 months ago

When mox has a NATed IP address behind a router and a user sends an email to another user (both on the same domain, which is managed by mox), the SPF check always fails because the incoming connection comes from the router's internal IP address.

I guess that mox connects to the router's public IP address (the IP address corresponding to the domain's MX record), and the router translates the connection to the mox server. But from mox's point of view, the connection comes from the router's internal IP address instead of the public IP address. This results in an SPF check failure.

Apr 27 12:40:01 nixos-rpi mox[6610]: l=debug m="spf verify result" pkg=smtpserver pkg=spf domain=example.com ip=192.168.1.254 status=softfail explanation= duration=42.611123ms cid=18f1ecd1c88 delta="103.333µs"

The behavior is different for SMTP connections coming from the Internet: in that case mox sees the real IP address of the remote connecting server, and SPF checks pass.

So my question is: is it normal that mox connects to itself when sending an email to a domain managed by mox? I was expecting mox to skip the SMTP part (and therefore skip SPF/DKIM checks) and deliver the email directly to the recipient's mailbox.

mjl- commented 2 months ago

Yes, mox delivers messages to domains it hosts through smtp like messages to any other domain.

The idea is that the delivery paths are the same, so all messages can expect the same treatment. The queue code has functionality like delayed delivery. The smtp server has code for rate limiting and reputation analysis. That should also be applied to messages between users of the same mox instance: you don't want to accidentally fill up someone else's mailbox, and you may want to reject messages from another local user too (e.g. consider a mox instance that hosts several unrelated domain names). Delivering messages without going through smtp also has a hypothetical risk of delivering to a domain that is configured in mox but for which no MX records are active. So it makes sense to me to keep delivering over smtp.

To solve this problem with SPF, perhaps we can add a configuration option to mox.conf to compensate. It could specify the router internal IP and the router public IP as a replacement. If the connection IP we are about to SPF-check matches the router internal IP, we would replace it with the router public IP before calling the SPF check code. The router public IP should be in the domain's SPF record, so the check would pass. The SPF check would still fail if the replacement router public IP is not allowed by the domain SPF record. I think this approach stays close to the normal expected protocol behaviour.

lmeunier commented 2 months ago

> The idea is that the delivery paths are the same, so all messages can expect the same treatment. The queue code has functionality like delayed delivery. The smtp server has code for rate limiting and reputation analysis. That should also be applied to messages between users of the same mox instance: you don't want to accidentally fill up someone else's mailbox, and you may want to reject messages from another local user too (e.g. consider a mox instance that hosts several unrelated domain names). Delivering messages without going through smtp also has a hypothetical risk of delivering to a domain that is configured in mox but for which no MX records are active. So it makes sense to me to keep delivering over smtp.

Yes, this totally makes sense. Forget my expectations and let's keep this behaviour. :)

> To solve this problem with SPF, perhaps we can add a configuration option to mox.conf to compensate. It could specify the router internal IP and the router public IP as a replacement. If the connection IP we are about to SPF-check matches the router internal IP, we would replace it with the router public IP before calling the SPF check code. The router public IP should be in the domain's SPF record, so the check would pass. The SPF check would still fail if the replacement router public IP is not allowed by the domain SPF record. I think this approach stays close to the normal expected protocol behaviour.

My use case is maybe not so common (having mox behind NAT and a router that translates internal connections to its internal IP address). If you are OK with it, let's do nothing for now and see if others have the same issue. If so, maybe we can add a configuration option to mox.conf. I found a workaround by defining a custom route with a ToDomain listing the domains managed by mox; this route is configured to use IPv6 only. In my case, the MX record points to a DNS name that has both IPv6 and IPv4 addresses. The IPv4 address is the router's public IP address, and the IPv6 address is directly attached to the mox server. So using only IPv6 makes the SPF check happy.

If it's ok for you, we can close this issue.

mjl- commented 2 months ago

I've heard of this issue before. If mox is behind a NAT, there seems to be a good chance it works like this (though I don't know exactly how common it is). Perhaps a config change in the router can change its behaviour for these connections? NAT hairpinning is described as translating the source IP of these connections to the public IP of the router, see https://en.wikipedia.org/wiki/Network_address_translation#NAT_hairpinning.

For people behind a NAT without the option to switch to IPv6, a config option to specify replacement IPs for SPF could still be useful. So I also wouldn't mind keeping this open and implementing it at some point in the future (not urgent). The implementation wouldn't be hard. Though it's additional code to maintain, better if it weren't needed...