mjl- / mox

modern full-featured open source secure mail server for low-maintenance self-hosted email
https://www.xmox.nl
MIT License

few issues, mostly ipv6 #88

Closed zkazsi closed 8 months ago

zkazsi commented 1 year ago

Once again, probably more of a discussion thread from me

ipv6 issues in domain check

In the meantime I've activated ipv6 on my VPS. Since then, I have the following errors on the Domain -> Check DNS page:

TLS

SMTP connection with STARTTLS to MX hostname "mail.example.com." IP 2a[redacted]: dial tcp [2a[redacted]]:25: i/o timeout

Having learned from our previous discussion, I've checked connectivity to the ipv6 IP using "openssl s_client -connect", on both port 465 and port 25, and both connections succeeded.

Despite this success, connections from mail clients don't seem to work over ipv6 - see further below.

TLS i/o timeout for both Autoconf and Autodiscover

Still on the Check DNS page, I have this error, for both categories:

TLS connection to hostname "autoconfig.example.com", IP "2a01:[redacted, ipv6]": dial tcp [2a01:[redacted]]:443: i/o timeout

The error message refers to DNS records that need to be set up, but they are all set up correctly.

errors in log connecting ipv6

When I have a look at my log, I see various cases connecting to ipv4 (apparently ok), and also to ipv6 - always timing out.

Example:

mox-mail-mox-1  | l=info m="new connection" pkg=imapserver remote=[2001:[redacted-home ipv6]]:59106 local=[2a01:[redacted, mox-ipv6]]:993 tls=true listener=public cid=18b843f4327 delta="320.433µs"
mox-mail-mox-1  | l=info m="imap command ioerror" err="reading line from remote: read tcp [2a01:[redacted, mox-ipv6]]:993->[2001:[redacted-home ipv6]]:59106: read: connection reset by peer (fatal io error)" pkg=imapserver cmd= duration=1.147467805s cid=18b843f4327 delta=1.373511795s username=me@example.com
mox-mail-mox-1  | l=info m="connection closed" err="reading line from remote: read tcp [2a01:[redacted, mox-ipv6]]:993->[2001:[redacted-home ipv6]]:59106: read: connection reset by peer (fatal io error)" pkg=imapserver cid=18b843f4327 delta="336.22µs" username=me@example.com

So it's clear that something is wrong with ipv6. (On the other hand, I'm also not experienced with ipv6, just "dabbling around" in all of this.) I think the two issues above are somehow related.

long time until closing connection sometimes

I've also noticed that the duration of the transaction shows up in the log: in some cases the time until closing the connections seems to be extremely long. This affects not only ipv6, but surprisingly ipv4 connections, as well

Example:

mox-mail-mox-1  | l=info m="new connection" pkg=imapserver remote=[redacted, ipv4 home]:50280 local=[redacted, ipv4, mox]:993 tls=true listener=public cid=18b7ed55dc2 delta="252.838µs"
mox-mail-mox-1  | l=info m="connection closed" pkg=imapserver cid=18b7ed55dc2 delta=9m5.19310387s username=me@example.com

Longest duration of transaction I've seen in the log was about 30m(!). Even though it doesn't disturb now, I'm wondering, is this normal?

Please feel free to let me know if these issues should be split - or if I should choose another way of discussing them.

mjl- commented 1 year ago

ipv6 issues in domain check TLS i/o timeout for both Autoconf and Autodiscover

are you running the openssl command on the VPS, or are you connecting from an external machine? and is mox running in docker? a timeout error often indicates there is a firewall completely dropping traffic. ideally, you would run the nc and openssl commands from the same place (container, vps) that mox is running. if you get a different result from those commands than from mox, we would have to dig deeper in mox. otherwise the most likely cause would be a firewall.
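A minimal sketch of that kind of check in Python, meant to be run from the same place as mox (host or container). The `probe` name and its return strings are illustrative assumptions, not part of mox. It only distinguishes a silent timeout (typical for dropped packets) from an active refusal, and does not attempt TLS.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a plain TCP connect and classify the outcome."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except (socket.timeout, TimeoutError):
        # No answer at all: packets are probably dropped (firewall, routing).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as err:
        # For example "Network is unreachable" when there is no IPv6 route.
        return f"error: {err}"
```

For example, `probe("2001:db8::1", 25)` (a documentation address used as a placeholder) could be run once on the VPS and once inside the container. Different results from the two places would point at the network setup in between rather than at mox.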

errors in log connecting ipv6

there were no logs snipped out in between those three lines? the errors may seem bad ("connection reset by peer (fatal io error)"), but the last two log lines are normal for connections that are being closed. if there was nothing between the first line and last two lines, that would be strange: it means the connection was set up, but then nothing happened. connections where nothing happens after the initial setup sometimes indicate that one side is expecting tls and the other isn't. some mail clients (perhaps even most!) pick the wrong settings by default...

long time until closing connection sometimes

for the IMAP server, this is normal behaviour. the connections are often long-lived: a mail client connects, fetches the latest emails, and runs the imap IDLE command. it blocks until a new message arrives (or other changes happen to the mailbox). mail clients may close and reopen connections periodically. but a connection could also live for a very long time.

for smtpserver long connections would be unexpected. you normally just connect to smtp to send a message, then close the connection. but the logs only show imapserver.
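The `delta` and `duration` values in the logs above look like Go duration strings (for example `9m5.19310387s`). To compare connection lifetimes, for instance to spot an unexpectedly long smtpserver connection, they can be converted to seconds. The following is a simplified sketch, not mox code; the function name is made up and it only handles the units seen in such strings.

```python
import re

# Seconds per unit; "\u00b5s" is the micro sign form that appears in the logs.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0,
          "ms": 1e-3, "\u00b5s": 1e-6, "us": 1e-6, "ns": 1e-9}
# Longer unit names first, so "ms" is not read as "m" followed by "s".
_PART = re.compile(
    r"(\d+(?:\.\d+)?)(" + "|".join(sorted(_UNITS, key=len, reverse=True)) + ")"
)


def go_duration_seconds(text: str) -> float:
    """Convert a duration such as '30m0.936387233s' to seconds."""
    text = text.strip('"')
    parts = _PART.findall(text)
    # Reject input that is not made up entirely of number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"not a duration: {text!r}")
    return sum(float(num) * _UNITS[unit] for num, unit in parts)
```

With this, a check such as `go_duration_seconds(delta) > 600` on "connection closed" lines would separate long-lived IMAP sessions from short ones; the 600 second threshold is an arbitrary example.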

zkazsi commented 1 year ago

on ipv6 in domain check

are you running the openssl command on the VPS, or are you connecting from an external machine? and is mox running in docker?

I was running the openssl command from an external machine. And yes, mox is running in docker (using docker compose, to be exact)

a timeout error often indicates there is a firewall completely dropping traffic.

Based on my understanding that would not be the case. I have ufw running, but all necessary ports are allowed (globally).

I also deactivated ufw completely for testing. Based on your hints I tried testing with openssl from within the VPS -> and indeed, in that case it doesn't work (so, would mean, not related to mox?)

I also realized, ipv6 wasn't activated in docker: now I tried to activate it - honestly, didn't help

From what I understand currently, the issue is probably related to the combination of docker and ipv6 - it may be too much (for me), but I'm still looking and trying further

errors in log

there were no logs snipped out in between those three lines?

I believe these were the only lines for the specific transaction (cid): of course much else was happening in between.

What I've realized unfortunately in the meantime: it doesn't only affect ipv6, but also sometimes ipv4 connections.

Mostly I use mox from several Android devices, with various clients set up (K-9 Mail and FairEmail). Here are a few other logs (again, only for the specific cid, but all transactions for that one):

Ex1 - ipv4:

mox-mail-mox-1  | l=info m="new connection" pkg=imapserver remote=157.[home ipv4]:38764 local=45.[mox ipv4]:993 tls=true listener=public cid=18b843f44e4 delta="316.216µs"
mox-mail-mox-1  | l=info m="imap command ioerror" err="reading line from remote: read tcp 45.[mox ipv4]:993->157.[home ipv4]:38764: i/o timeout (fatal io error)" pkg=imapserver cmd= duration=30m0.000402325s cid=18b843f44e4 delta=30m0.936387233s username=user@example.com
mox-mail-mox-1  | l=info m="connection closed" err="reading line from remote: read tcp 45.[mox ipv4]:993->157.[home ipv4]:38764: i/o timeout (fatal io error)" pkg=imapserver cid=18b843f44e4 delta="670.516µs" username=user@example.com

Ex2 - ipv6:

mox-mail-mox-1  | l=info m="new connection" pkg=imapserver remote=[2001:[home ipv6]]:36228 local=[2a01:[mox ipv6]]:993 tls=true listener=public cid=18b843f450e delta="89.601µs"
mox-mail-mox-1  | l=info m="imap command ioerror" err="reading line from remote: read tcp [2a01:[mox ipv6]]:993->[2001:[home ipv6]]:36228: i/o timeout (fatal io error)" pkg=imapserver cmd= duration=30m0.034121512s cid=18b843f450e delta=31m1.626627362s username=user@example.com
mox-mail-mox-1  | l=info m="connection closed" err="reading line from remote: read tcp [2a01:[mox ipv6]]:993->[2001:[home ipv6]]:36228: i/o timeout (fatal io error)" pkg=imapserver cid=18b843f450e delta=1.827707ms username=user@example.com

This is a recurring pattern - 3 lines, not much else.
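A pattern like this can be confirmed by grouping log lines on their `cid` field. The sketch below assumes the key=value layout visible in the excerpts above, with an optional docker compose prefix in front; `parse_line` and `group_by_cid` are hypothetical helper names, and values containing an apostrophe or backslash would need a more careful parser.

```python
import shlex
from collections import defaultdict


def parse_line(line: str) -> dict:
    """Split one log line into a dict of its key=value fields."""
    # Drop a prefix such as "mox-mail-mox-1  | " added by docker compose.
    prefix, sep, rest = line.partition(" | ")
    if sep and "=" not in prefix:
        line = rest
    fields = {}
    # shlex keeps quoted values like m="new connection" together.
    for token in shlex.split(line):
        key, eq, value = token.partition("=")
        if eq:
            fields[key] = value
    return fields


def group_by_cid(lines) -> dict:
    """Map each connection id to the list of its log messages, in order."""
    groups = defaultdict(list)
    for line in lines:
        fields = parse_line(line)
        if "cid" in fields:
            groups[fields["cid"]].append(fields.get("m", ""))
    return dict(groups)
```

Connections whose message list is exactly "new connection", "imap command ioerror", "connection closed" are the ones where nothing happened after the initial setup.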

if there was nothing between the first line and last two lines, that would be strange: it means the connection was set up, but then nothing happened. connections where nothing happens after the initial setup sometimes indicate that one side is expecting tls and the other isn't. some mail clients (perhaps even most!) pick the wrong settings by default...

hmm, thanks for the explanation. I have a feeling (but no specific proof) that it may be affecting K-9 Mail more than FairEmail. Also, I just checked Thunderbird on desktop, and it seems to be affected as well, unfortunately.

finally

for the IMAP server, this is normal behaviour.

great, that's settled then :) I looked into it more: I can see long transaction times only for imapserver, not for smtp.

I realize, these may all be basic questions, and not always related to mox specifically. I'm only encouraged by the fact that you want more real life experience, from real use of mox.

mjl- commented 1 year ago

I also deactivated ufw completely for testing. Based on your hints I tried testing with openssl from within the VPS -> and indeed, in that case it doesn't work (so, would mean, not related to mox?)

indeed.

i wouldn't be surprised if there is some kind of issue with docker and ipv6 that breaks normal operation. by disabling ufw, you may have also disabled some docker-firewall-integration...

there were no logs snipped out in between those three lines?

I believe these were the only lines for the specific transaction (cid): of course much else happening inbetween Here few other logs (again, only for the specific cid, but all transactions for that one):

Which log level is mox configured at? You could set it to "trace" to get more details. Perhaps something is happening from one side, but the other is not responding.

I realize, these may all be basic questions, and not always related to mox specifically. I'm only encouraged by the fact that you want more real life experience, from real use of mox.

If you run into these issues, others may run into them as well. In some cases we can add a warning/documentation or change mox behaviour, so people don't run into those issues or can more easily find a solution. So these reports are helpful, thanks!

zkazsi commented 1 year ago

ipv6 issues

It indeed looks like something with docker and ipv6 was (is) not working: I decided to move away from docker (for the sake of testing) and run mox directly => and it seems to work, also with ipv6

l=info m="new connection" pkg=smtpserver remote=[2607:f8b0:4864:20::a31]:60525 local=[2a01:[mox ipv6]]:25 submission=false tls=false listener=public cid=18b999e61a0 delta="389.068µs"
l=info m="reputation analyzed" mailfrom=sender@gmail.com rcptto=user@example.com pkg=smtpserver conclusive=true isjunk=false method=msgfromfull cid=18b999e61a0 delta=739.213048ms
l=info m="incoming message delivered" mailfrom=sender@gmail.com rcptto=user@example.com pkg=smtpserver reason=msgfromfull msgfrom=sender@gmail.com cid=18b999e61a0 delta=19.832726ms
l=info m="connection closed" pkg=smtpserver cid=18b999e61a0 delta=103.594964ms
l=info m="new connection" pkg=imapserver remote=[2001:home ipv6]:49782 local=[2a01:[mox ipv6]]:993 tls=true listener=public cid=18b999e61a1 delta="307.796µs"
l=info m="connection closed" pkg=imapserver cid=18b999e619f delta=6m3.497069926s username=user@example.com

So, this shows, both smtpserver (for receiving from gmail) and imapserver (from my local ipv6) work well - without docker.

For now I'm running sudo mox serve in the foreground (which will not be a good solution long term)... I still need to figure out how to enable the systemd service.

I'll also do some further testing with log levels, then I'll see what the situation is with the timeouts and similar.

zkazsi commented 1 year ago

Well, once again it turns out it was kind of a "user error"

I found a hint somewhere regarding ipv6 and ensuring that the /etc/sysctl.conf file does not have any values in place that might be disabling IPv6 connectivity. And indeed, there were:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

I've changed all values to "0" and, lo and behold, ipv6 is working (both in docker and when running directly). Even when running directly on the host I had problems on the domain check page, and of course all of that is solved now as well. So even though I didn't set these values myself, they were part of the basic system configuration.
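A small sketch for spotting such entries, assuming the usual `key = value` layout of sysctl configuration files with `#` or `;` comment lines. The function name is made up for illustration, and files under /etc/sysctl.d/ may contain the same settings and would need the same check.

```python
def ipv6_disabling_settings(sysctl_text: str) -> list[str]:
    """Return the keys in a sysctl config that switch IPv6 off."""
    found = []
    for raw in sysctl_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        key, eq, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if (eq and key.startswith("net.ipv6.conf.")
                and key.endswith(".disable_ipv6") and value == "1"):
            found.append(key)
    return found
```

It could be called with the contents of /etc/sysctl.conf, for example `ipv6_disabling_settings(open("/etc/sysctl.conf").read())`. The value currently in effect for all interfaces can be read from /proc/sys/net/ipv6/conf/all/disable_ipv6.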

So, now I'm able to run mox in 2 ways: in docker, or directly on the host.

I do however have new, probably easy, questions.

installing systemd service

I think I might prefer to run mox directly (instead of in docker) to get the latest improvements, like DNSSEC and DANE. When running it this way, I currently use screen (to run it in the background). That seems to run fine for now, but I believe the preferred way to run services is via systemd. On installation mox created a mox.service file, which I somehow just can't activate.

I copied it over to /etc/systemd/system (sudo cp mox.service /etc/systemd/system), then enabled and started it. However, systemctl status mox always shows the service failing / constantly restarting:

mox@mail:~$ sudo systemctl status mox
● mox.service - mox mail server
     Loaded: loaded (/etc/systemd/system/mox.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Sat 2023-11-04 14:39:16 UTC; 4s ago
    Process: 3649 ExecStart=/home/mox/mox serve (code=exited, status=203/EXEC)
   Main PID: 3649 (code=exited, status=203/EXEC)
        CPU: 78ms

Any ideas what I could be doing wrong? I was checking permissions, it has same permissions as other service files in /etc/systemd/system: -rw-r--r-- 1 root root 1325 Nov 4 06:31 mox.service

[Update: later on changed to permission 644: no difference]

Hopefully last question for a while

mjl- commented 1 year ago

The error is likely about /home/mox/mox not having execute permissions. I suppose running it as root (with "mox serve") does work? When mox starts through systemd, it also starts as root, then drops privileges.

A quick search leads me to https://unix.stackexchange.com/questions/472950/systemd-status-203-exec-error-when-creating-new-service One tip stands out: it could be selinux. Whenever something that obviously should work, does not work on linux, and selinux is enabled, that's a prime suspect.

You can probably get a few more log lines with: journalctl -n 100 -u mox.

zkazsi commented 1 year ago

Once more, thanks for your hints: it helped, and is now solved. It was actually a path issue: after building it, I moved mox to /usr/local/bin, not to /home/mox for some reason, and this was of course not changed in mox.service. Anyway, it's working well now.
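The status=203/EXEC result means systemd could not execute the configured command, which fits a binary that is no longer at the ExecStart path. A rough sketch of a check for that case follows. It is not part of mox or systemd, the function name is made up, and it only looks at single-line `ExecStart=` entries.

```python
import os
import shlex


def check_exec_start(unit_text: str) -> list[str]:
    """Report ExecStart binaries that are missing or not executable."""
    problems = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line.startswith("ExecStart="):
            continue
        parts = shlex.split(line[len("ExecStart="):])
        if not parts:
            # An empty "ExecStart=" only resets earlier entries.
            continue
        # systemd allows special prefix characters before the path.
        path = parts[0].lstrip("-@+!:")
        if not os.path.isfile(path):
            problems.append(f"{path}: no such file")
        elif not os.access(path, os.X_OK):
            problems.append(f"{path}: not executable")
    return problems
```

Run against the unit file from this thread, for example `check_exec_start(open("/etc/systemd/system/mox.service").read())`, it would list /home/mox/mox as a problem if the binary actually lives in /usr/local/bin.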

I go back to playing with it. My next plans related to mox:

Normally, I don't think these should be too difficult ... however you never know. Thanks again for your patience so far :)

mjl- commented 1 year ago

set up properly DNSSEC and DANE (w/authoritative dns running on same server)

i recently started running dns myself too. you should have at least 2 machines doing dns, for redundancy. perhaps it's even required when updating the name servers at the registrar.

i settled on using bind: it has automatic dnssec signing and key management and is relatively straightforward, and the signed zones can easily be automatically propagated to a second bind instance. for alternatives, i also looked at nsd, but i think it needs separate tools (with cronjobs) to generate signed zones. powerdns should be automatic, but also seems a bit complex and too much for my needs. knot was also an option, but seems younger and less well-known.

when migrating, think about the TTL's of your records, including NS records at your current DNS operator and at the TLD.

zkazsi commented 1 year ago

Hi Mechiel!

thanks for the hints. I've already been running it for a few weeks for this domain: I use knot-dns (authoritative), and it's working well. Only the DNSSEC and DANE settings remained, but I've been able to solve that and set them up correctly (actually, the issue was more related to recursive DNS; I've installed unbound, as recommended on Domain Check) => Domain Check shows all green :)

What remains still open for now however: those i/o errors: I'll check if the changes in the meantime have helped + will play more with log levels, to see if there's

Also, one more idea came to me: since now I run mox on host, yet would like to use docker maybe for other things, I will need to dig into mox webserver (e.g. connecting host webserver with docker containers for reverse proxying)

Will update here (if I have progress).