mjl- / mox

modern full-featured open source secure mail server for low-maintenance self-hosted email
https://www.xmox.nl
MIT License

On hosts with a single routable interface, quickstart followed by mox serve fails when listening on :80 with "address already in use" #52

Closed kikoreis closed 1 year ago

kikoreis commented 1 year ago

After doing a first mox quickstart and a mox serve I'm seeing a failure when binding to port 80:

l=fatal m="http: listen" err="listen tcp4 0.0.0.0:80: bind: address already in use" pkg=http addr=0.0.0.0:80

A strace shows bind() failing:

bind(34, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

Interestingly, when running this a few times over, the failure sometimes happens when binding to 127.0.0.1, other times when binding to 0.0.0.0.

kikoreis commented 1 year ago

The relevant bind()s to port 80 are here:

bind(18, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
bind(20, {sa_family=AF_INET6, sin6_port=htons(80), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
bind(22, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.20.0.47")}, 16) = 0
bind(34, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)

kikoreis commented 1 year ago

I'm not sure why this is happening; I can bind using nc -l -p 80 just fine, and netstat -apn shows the nc process running and listening on port 80.

kikoreis commented 1 year ago

Ah, playing a bit with the config it probably stems from there being two listeners set up this way:

        internal:

                # Use 0.0.0.0 to listen on all IPv4 and/or :: to listen on all IPv6 addresses, but
                # it is better to explicitly specify the IPs you want to use for email, as mox
                # will make sure outgoing connections will only be made from one of those IPs.
                IPs:
                        - 127.0.0.1
                        - ::1
                        - 172.20.0.47

and

        public:

                # Use 0.0.0.0 to listen on all IPv4 and/or :: to listen on all IPv6 addresses, but
                # it is better to explicitly specify the IPs you want to use for email, as mox
                # will make sure outgoing connections will only be made from one of those IPs.
                IPs:
                        - 0.0.0.0
                        - ::

Is there an expectation that there are multiple non-localhost IPs available for the host to use?

mjl- commented 1 year ago

Could it be there are no remaining IPs for 0.0.0.0 to take? I'll check the behaviour of listening on the unspecified addresses. I thought it would take and occupy any addresses not explicitly taken by other listeners, but perhaps it needs at least one address.

Is there an expectation that there are multiple non-localhost IPs available for the host to use?

No, it isn't needed. The quickstart tries to find the actual public (non-internal) IPs to use for the public listener. But it looks like it couldn't find any in this case. It just adds the unspecified ipv4 and ipv6 addresses in that case. Perhaps not the most helpful, at least not without a warning. Could this be a machine without public IPs? Not having a public IP will complicate things quite a bit.
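
A minimal sketch of the kind of filtering described above, assuming "public" means global unicast and not in a private range. The helper `publicIPs` is hypothetical and only illustrates the idea; mox's actual quickstart detection may work differently.

```go
package main

import (
	"fmt"
	"net/netip"
)

// publicIPs returns the addresses that look publicly routable: global unicast
// and not in a private range (RFC 1918 for IPv4, unique local for IPv6).
// Hypothetical helper for illustration; mox's own detection may differ.
func publicIPs(ips ...string) ([]string, error) {
	var public []string
	for _, s := range ips {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", s, err)
		}
		// IsGlobalUnicast already excludes loopback, link-local, multicast
		// and unspecified addresses, but not private ranges.
		if addr.IsGlobalUnicast() && !addr.IsPrivate() {
			public = append(public, addr.String())
		}
	}
	return public, nil
}

func main() {
	// Addresses as seen on the reporter's host: loopback plus one RFC 1918 address.
	public, err := publicIPs("127.0.0.1", "::1", "172.20.0.47")
	if err != nil {
		panic(err)
	}
	if len(public) == 0 {
		fmt.Println("no public IPs found; falling back to 0.0.0.0 and :: would be the only option")
	}
}
```

With only loopback and a 172.20.0.47 address on the host, the filter returns nothing, which matches the situation where the public listener ends up with the unspecified addresses.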

mjl- commented 1 year ago

Surprisingly, I can't find substantive/clear/conclusive documentation on behavior of listening on the unspecified ipv4 address, but a quick test shows that I can listen on 0.0.0.0 as long as there is a non-127/8 ip available (for the port).

Anyway, I suspect that this is a virtual machine without a public IP, or a docker container. If the mail server is for internal use, it's probably enough to move the 172... IP to the public listener. If there should be external access, then in the case of docker I would recommend starting with the docker-compose file (see repo) with host networking. If there is a firewall/load balancer between the machine and the internet, the setup will be more complicated, and mox won't have access to the remote IPs of incoming connections, causing problems with IP-based reputation analysis, including rate limiting.

kikoreis commented 1 year ago

It turns out the problem is the generated mox config itself; if I disable the WebserverHTTP and WebserverHTTPS listeners (or specify a different port for them) then the "internal" AdminHTTP and AdminHTTPS handlers can bind correctly to port 80/443 and work. I guess the bug is just that out of the box a ./mox serve fails if you just run quickstart, so perhaps default the internal handlers to a different port?

mjl- commented 1 year ago

I can reproduce the error message. I was under the impression that we could listen on 0.0.0.0:80 when we already had a listener for <specific-ip>:80. But that's not possible unless we set the SO_REUSEPORT socket option. That option does too much, though: it also allows us to bind to the specific IP multiple times, and incoming connections would be spread over the listeners. Adding SO_REUSEPORT would only paper over the problem.
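
The conflict can be reproduced outside mox with a small sketch like the one below. It assumes Linux behaviour (other operating systems may treat the combination differently) and uses a kernel-chosen port instead of 80 so it runs without privileges. The function name `wildcardConflicts` is made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// wildcardConflicts listens on a specific IPv4 address, then tries to listen
// on 0.0.0.0 with the same port. It reports whether the second listen failed
// with EADDRINUSE. Assumption: Linux semantics, no SO_REUSEPORT set.
func wildcardConflicts() (bool, error) {
	// Port 0 lets the kernel pick a free port, so no root privileges are needed.
	specific, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer specific.Close()

	port := specific.Addr().(*net.TCPAddr).Port
	wildcard, err := net.Listen("tcp4", fmt.Sprintf("0.0.0.0:%d", port))
	if err == nil {
		// The platform allowed both listeners at once.
		wildcard.Close()
		return false, nil
	}
	if errors.Is(err, syscall.EADDRINUSE) {
		return true, nil
	}
	return false, err
}

func main() {
	conflict, err := wildcardConflicts()
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("0.0.0.0 conflicts with the existing specific-IP listener:", conflict)
}
```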

The reason disabling WebserverHTTP(S) on the public listener helps is that it prevents binding to 0.0.0.0:80 (with the internal listener already listening on <specific-ip>:80).
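
To make the overlap concrete, here is a rough sketch that checks two listeners' IP lists for collisions on a shared port. It assumes IPv4 and IPv6 sockets are bound separately (the log above shows a tcp4 listen) and that an unspecified address collides with every specific address of the same family. `overlaps` and `findConflicts` are hypothetical helpers, not part of mox.

```go
package main

import (
	"fmt"
	"net/netip"
)

// overlaps reports whether two listen addresses would collide on the same port:
// they are in the same address family and are either identical or one of them
// is the unspecified address (0.0.0.0 or ::).
func overlaps(a, b netip.Addr) bool {
	if a.Is4() != b.Is4() {
		return false
	}
	return a == b || a.IsUnspecified() || b.IsUnspecified()
}

// findConflicts returns "x vs y" for every overlapping pair between the IPs of
// two listeners that serve the same port. Hypothetical helper, not part of mox.
func findConflicts(internal, public []string) ([]string, error) {
	var conflicts []string
	for _, i := range internal {
		a, err := netip.ParseAddr(i)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", i, err)
		}
		for _, p := range public {
			b, err := netip.ParseAddr(p)
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", p, err)
			}
			if overlaps(a, b) {
				conflicts = append(conflicts, i+" vs "+p)
			}
		}
	}
	return conflicts, nil
}

func main() {
	// The IP lists from the generated config quoted earlier in this thread.
	internal := []string{"127.0.0.1", "::1", "172.20.0.47"}
	public := []string{"0.0.0.0", "::"}
	conflicts, err := findConflicts(internal, public)
	if err != nil {
		panic(err)
	}
	for _, c := range conflicts {
		fmt.Println("conflict on shared port:", c)
	}
}
```

Moving 172.20.0.47 to the public listener and leaving only the loopback addresses on the internal one yields no conflicts under these assumptions.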

I think the main problem to solve is quickstart not finding the correct IP(s) for the public listener. I'm still wondering why it couldn't find your public IP (if there is one). Without one, you'll have trouble with SMTP.

So far, I think the best mox quickstart can do is warn louder that the IPs in the public listener need to be changed to the actual public IPs.

kikoreis commented 1 year ago

I'm running a VM on a public cloud; locally I get a single RFC1918 address. I do get a single floating IP but that's provided via NAT and therefore my instance doesn't see it locally. So there's only really the option of changing ports.

In my config I've moved to using AccountHTTPS and AdminHTTPS on port 8080, and that got me to the point where mox could serve. I set internal IPs to the localhost and local address, and the public IPs to the local address as well. Obviously I can't bind an interface to my public address, so that's the only option I've found so far.

I'm now receiving mail and actually very impressed with how well mox works. The main issue was that it took me a while to figure out why quickstart didn't just work out of the box.

mjl- commented 1 year ago

Clear, thanks!

Have you found the IPsNATed config option on the Listener yet? It could help with the dnscheck warnings. Perhaps it will make more sense if that option just specifies the public IP so we can check against that.

Do the incoming internet connections all appear to come from a NAT gateway, or are remote IPs preserved? Mox stores the remote IP address with each incoming message. If you classify messages as junk, the reputation of those messages is used for future incoming messages (multiple signals, including IP address). If all messages are coming from a single IP, a flurry of junk messages may block legitimate mail too. Mox has an IP-based rate limiter too, meant to block bad actors, but if all connections are from a single IP, all incoming connections may be blocked.
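
A toy illustration of why a shared source IP hurts an IP-based rate limiter. This is a simple fixed-window counter written for this example; it is not mox's rate limiter, and the limit of 2 per minute and the addresses are arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// limiter is a fixed-window counter per remote IP. Illustration only.
type limiter struct {
	max    int
	window time.Duration
	counts map[string]int
	start  map[string]time.Time
}

func newLimiter(max int, window time.Duration) *limiter {
	return &limiter{
		max:    max,
		window: window,
		counts: map[string]int{},
		start:  map[string]time.Time{},
	}
}

// allow records one connection from ip at time now and reports whether it is
// still within the limit for the current window.
func (l *limiter) allow(ip string, now time.Time) bool {
	if s, ok := l.start[ip]; !ok || now.Sub(s) >= l.window {
		l.start[ip] = now
		l.counts[ip] = 0
	}
	if l.counts[ip] >= l.max {
		return false
	}
	l.counts[ip]++
	return true
}

// simulate feeds one connection per entry in ips, all at the same instant,
// into a fresh limiter and returns how many were allowed.
func simulate(max int, ips []string) int {
	l := newLimiter(max, time.Minute)
	now := time.Now()
	allowed := 0
	for _, ip := range ips {
		if l.allow(ip, now) {
			allowed++
		}
	}
	return allowed
}

func main() {
	// Remote IPs preserved: four senders, four separate budgets.
	preserved := []string{"198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"}
	// Behind a NAT gateway: the same four senders all appear as one IP.
	natted := []string{"10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.1"}

	fmt.Println("allowed with preserved IPs:", simulate(2, preserved))
	fmt.Println("allowed behind NAT gateway:", simulate(2, natted))
}
```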

kikoreis commented 1 year ago

Oh, I wasn't aware of that option; I'll enable it for the external listener.

Also, it does appear that it would probably be sensible to have it indicate which IPs are behind a NAT as opposed to applying to all the IPs in a listener (since you might have a mix like I do here).

On this platform remote IPs are preserved, so based on what you say Mox's classification won't be affected. It might make sense to mention this in the documentation but I'm not sure how common it is to run behind a NAT which doesn't preserve remote IPs. Thanks!

kikoreis commented 1 year ago

I enabled the option, but I still see warnings like this:

l=error m="warning: acme tls cert validation for host is likely to fail because not all its ips are being listened on" pkg=autotls hostname=FOO listenedips=[INTERNAL_IP] hostips=[PUBLIC_IP] missingip=PUBLIC_IP

I think your change addresses the original bug; I'll amend the title to be clearer but feel free to close and I'll open another one for the warnings (or allowing one to set a nonbound public IP) if there is something to do there.

mjl- commented 1 year ago

Thanks again, the autotls check indeed didn't take the NAT setting into account. I now added the NATIPs option, to be used instead of IPsNATed, so we can do the DNS checks like we do in case of no NAT.
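
Roughly, the check compares the IPs a hostname resolves to against the IPs that are reachable through the listener. A sketch of that comparison, assuming that addresses listed as NAT IPs count as covered; `missingIPs` is a hypothetical helper, and the real logic in mox may differ (for example in how wildcard listeners are handled, which this sketch ignores).

```go
package main

import (
	"fmt"
	"net/netip"
)

// missingIPs returns the host IPs (as found in DNS) that are neither listened
// on directly nor declared as NAT IPs. Hypothetical helper for illustration;
// wildcard listen addresses are not treated specially here.
func missingIPs(hostIPs, listenIPs, natIPs []string) ([]string, error) {
	covered := map[netip.Addr]bool{}
	for _, list := range [][]string{listenIPs, natIPs} {
		for _, s := range list {
			addr, err := netip.ParseAddr(s)
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", s, err)
			}
			covered[addr] = true
		}
	}
	var missing []string
	for _, s := range hostIPs {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", s, err)
		}
		if !covered[addr] {
			missing = append(missing, addr.String())
		}
	}
	return missing, nil
}

func main() {
	// 203.0.113.10 stands in for the floating public IP; it is a placeholder.
	host := []string{"203.0.113.10"}
	listen := []string{"172.20.0.47"}

	without, _ := missingIPs(host, listen, nil)
	with, _ := missingIPs(host, listen, []string{"203.0.113.10"})
	fmt.Println("missing without NAT IPs:", without)
	fmt.Println("missing with NAT IPs:", with)
}
```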

I'm not sure how common it is to run behind a NAT which doesn't preserve remote IPs

I think this is the case if you try to set up mox in a docker container without host networking. People used to docker for their web apps tend to think they want this for mox too. I added some of that context to the quickstart warning, which should prevent folks from going down the wrong path.

The quickstart can probably be smarter still about detecting its environment, but this should already help quite a bit.

I'm wondering if it would make sense for your setup to move your 172... IP from the private to the public listener. I typically access the private listener through a VPN or SSH forward. Having the IP only in the public listener would prevent the "address in use" error. If you want to access the admin/account endpoints directly from the internet, you could choose to enable them (only) on the public listener.

Thanks for the report and feedback, very helpful!

lormayna commented 9 months ago

I have a similar problem with a dual IPv4/IPv6 server:

Feb 08 12:10:57 localhost mox[23565]: l=error m="warning: acme tls cert validation for host is likely to fail because not all its ips are being listened on" pkg=mox hostname=autoconfig.mydomain.tld listenedips=[$IPv4] hostips=[$IPv6;$IPv4] missingip=$IPv6

mjl- commented 9 months ago

@lormayna this should mean that autoconfig.<yourdomain> resolves to both an ipv4 and an ipv6 address, and that you're listening on the ipv4 address, but not on the ipv6 address.

are you ensuring in some other way that the ipv6 requests make it to your ipv4 address?

if not, this can indeed cause trouble: when a certificate is requested, let's encrypt verifies the challenge by making a connection to your server.

if ipv6 requests do make it to your mox instance, and you have NAT configured, your public ipv6 address should also be in the NATIPs config field, see https://www.xmox.nl/config/#cfg-mox-conf-Listeners-x-NATIPs.

lormayna commented 9 months ago

You are right, I just forgot to uncomment the config for IPv6 :) Thank you
