ONLYOFFICE / Docker-CommunityServer

Collaborative system for managing documents, projects, customer relations and emails in one place
MIT License

Cannot resolve MX record on mailserver container (but host VM can) #120

Open sbosshardt opened 3 years ago

sbosshardt commented 3 years ago

Summary

I suspect the hostname of the mail server container is preventing successful resolution of the MX record. This may be the same issue as #86, but I didn't want to muddy the existing discussion.

I think this is a bug, but I'd welcome a second opinion or advice. My next step would be to change the hostname of the onlyoffice-mail-server container. I hesitate to do so because I assume there's a reason the hostname was set that way (e.g. something to do with SMTP HELO/EHLO).

Background

Greenfield scenario: I installed OnlyOffice Workspace onto a fresh VPS, for a fresh domain name (execreations.com).

wget https://download.onlyoffice.com/install/workspace-install.sh
bash workspace-install.sh -md "execreations.com"

Just about everything in ONLYOFFICE seems to be working well, except for mail. The DNS check in the ONLYOFFICE Mail Server settings reports that the MX record is not correctly configured. The other two records it asked for (both TXT) are verified as correct (green checkmarks). I fiddled with DNS records for a long time trying to get the MX record accepted, ruling out caching as the problem and even moving my nameservers to another provider. No matter what, the settings validator would not detect the record.

If, on my host VM, I try dig execreations.com MX or dig @8.8.8.8 execreations.com MX, the record is resolved.

root@vps2245225:~# dig execreations.com MX +short
0 execreations.com.
root@vps2245225:~# dig @8.8.8.8 execreations.com MX +short
0 execreations.com.
root@vps2245225:~# hostname
vps2245225
root@vps2245225:~# domainname
(none)

If I attach to the onlyoffice-mail-server container and try the commands, I'm unable to resolve without an external DNS server specified.

root@vps2245225:~# docker exec -it d84f215453e3 bash
[root@execreations /]# dig execreations.com MX +short
[root@execreations /]# dig @8.8.8.8 execreations.com MX +short
0 execreations.com.
[root@execreations /]# hostname
execreations.com
[root@execreations /]# domainname
(none)
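
The host-versus-container comparison above can be scripted so it is easy to repeat after each change. This is only a minimal sketch, not something from the original report: the function name compare_mx is hypothetical, and 8.8.8.8 is simply the public resolver used in the transcripts. It relies on nothing more than dig's +short output.

```shell
#!/usr/bin/env bash
# Compare the MX answer from the default resolver with the answer from an
# external resolver. An empty default answer combined with a non-empty
# external answer points at the local (e.g. Docker) resolver.
compare_mx() {
    local domain="$1"
    local external="${2:-8.8.8.8}"   # assumption: any reachable public resolver
    local local_answer external_answer

    local_answer="$(dig "$domain" MX +short)"
    external_answer="$(dig "@$external" "$domain" MX +short)"

    if [ -z "$local_answer" ] && [ -n "$external_answer" ]; then
        echo "local resolver returns no MX; $external returns: $external_answer"
        return 1
    elif [ "$local_answer" = "$external_answer" ]; then
        echo "resolvers agree: ${local_answer:-<no MX record>}"
        return 0
    else
        echo "resolvers disagree: local='$local_answer' external='$external_answer'"
        return 2
    fi
}
```

Run inside the container, for example as compare_mx execreations.com, a return status of 1 would correspond to the behaviour shown in the transcript above.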

When I try a trace,

[root@execreations /]# dig execreations.com MX +trace

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6_10.7 <<>> execreations.com MX +trace
;; global options: +cmd
;; Received 17 bytes from 127.0.0.11#53(127.0.0.11) in 0 ms

[root@execreations /]# dig @8.8.8.8 execreations.com MX +trace

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.el6_10.7 <<>> @8.8.8.8 execreations.com MX +trace
; (1 server found)
;; global options: +cmd
.           52353   IN  NS  a.root-servers.net.
.           52353   IN  NS  b.root-servers.net.
.           52353   IN  NS  c.root-servers.net.
.           52353   IN  NS  d.root-servers.net.
.           52353   IN  NS  e.root-servers.net.
.           52353   IN  NS  f.root-servers.net.
.           52353   IN  NS  g.root-servers.net.
.           52353   IN  NS  h.root-servers.net.
.           52353   IN  NS  i.root-servers.net.
.           52353   IN  NS  j.root-servers.net.
.           52353   IN  NS  k.root-servers.net.
.           52353   IN  NS  l.root-servers.net.
.           52353   IN  NS  m.root-servers.net.
;; Received 228 bytes from 8.8.8.8#53(8.8.8.8) in 8 ms

;; Truncated, retrying in TCP mode.
com.            172800  IN  NS  a.gtld-servers.net.
com.            172800  IN  NS  b.gtld-servers.net.
com.            172800  IN  NS  c.gtld-servers.net.
com.            172800  IN  NS  d.gtld-servers.net.
com.            172800  IN  NS  e.gtld-servers.net.
com.            172800  IN  NS  f.gtld-servers.net.
com.            172800  IN  NS  g.gtld-servers.net.
com.            172800  IN  NS  h.gtld-servers.net.
com.            172800  IN  NS  i.gtld-servers.net.
com.            172800  IN  NS  j.gtld-servers.net.
com.            172800  IN  NS  k.gtld-servers.net.
com.            172800  IN  NS  l.gtld-servers.net.
com.            172800  IN  NS  m.gtld-servers.net.
;; Received 830 bytes from 192.58.128.30#53(192.58.128.30) in 9 ms

execreations.com.   172800  IN  NS  ns3.epik.com.
execreations.com.   172800  IN  NS  ns4.epik.com.
;; Received 107 bytes from 192.26.92.30#53(192.26.92.30) in 132 ms

execreations.com.   300 IN  MX  0 execreations.com.
;; Received 66 bytes from 52.55.168.70#53(52.55.168.70) in 70 ms

On a side note, I wanted to express my compliments to the dev team. I've only recently begun to evaluate/use ONLYOFFICE Workspace and so far it looks great.

Carazyda commented 3 years ago

Hi @sbosshardt. What OS are you using on the host? Any firewall enabled? Have you tried restarting docker container with onlyoffice-mail server? In the mail server portal interface, when you add your own domain, the MX record is not checked in green? Are mailboxes being created? Can you send and receive mail?

sbosshardt commented 3 years ago

Hi @Carazyda,

Thanks for the troubleshooting questions. Prior to your comment, I found a workaround and am able to send and receive test messages. I tried the steps in the following documentation and it got me to a point where the MX record would validate: https://helpcenter.onlyoffice.com/installation/mail-change-domain.aspx

The domain I changed to was "mail.execreations.com", and I created a corresponding MX record for it. The workaround got me to a point where I was able to create email accounts, and to send and receive mail. I didn't get rid of my original MX record for "execreations.com" (I figured that the Let's Encrypt certificate created by ONLYOFFICE doesn't cover the "mail" subdomain).

What OS are you using on the host? OS: Ubuntu 20.04.2 LTS

Any firewall enabled? I don't think so. I hadn't made any changes to default settings. I'm not too savvy on Linux firewalls, but here's some output which might provide insight:

root@vps2245225:~# ufw status
Status: inactive
root@vps2245225:~# iptables --list
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy DROP)
target     prot opt source               destination         
DOCKER-USER  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-1  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (2 references)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:tproxy
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:sieve
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:pop3s
ACCEPT     tcp  --  anywhere             172.18.0.5           tcp dpt:xmpp-client
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:imaps
ACCEPT     tcp  --  anywhere             172.18.0.5           tcp dpt:https
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:submission
ACCEPT     tcp  --  anywhere             172.18.0.5           tcp dpt:http
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:submissions
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:imap2
ACCEPT     tcp  --  anywhere             172.18.0.3           tcp dpt:smtp

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-ISOLATION-STAGE-2 (2 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere    

By the way, since you asked about firewall, I went ahead and used netstat -tln to view all the listening ports. I found three that I'm unsure about exposing to the Internet. Does ONLYOFFICE intentionally make these available? If not, I'll consider firewalling them.

Have you tried restarting docker container with onlyoffice-mail server? I had tried rebooting the host OS many times (which would include all running docker containers).

In the mail server portal interface, when you add your own domain, the MX record is not checked in green? Correct - Only after I'd done the workaround ("Changing domain used with ONLYOFFICE Mail" article) was I able to get the MX record to show with a green checkmark.

Are mailboxes being created? Can you send and receive mail? I did not try prior to the workaround. After the workaround, I was able to.

tuliogs commented 2 years ago

Hi! First of all, thanks and congratulations to the team for this amazing piece of software. I bumped into this same problem, and I think it might indeed be related to #86, but not necessarily caused by it. There are actually 3 potentially related bugs that caught my attention:

1

The reasoning: I too am getting a "duplicate entry" message for cluebringer's init script upon server start:

ERROR 1062 (23000) at line 3 in file: '/tmp/cluebringer_init_sql.3251110345': Duplicate entry 'SenderIP:172.20.0.6' for key 'Source'

Indeed, lines 3 and 4 of said script reference the server's name twice, with different addresses:

INSERT INTO greylisting_whitelist (Source, Comment, Disabled) VALUES ("SenderIP:172.20.0.5", 'my.redacted.domain', 0);
INSERT INTO greylisting_whitelist (Source, Comment, Disabled) VALUES ("SenderIP:my.redacted.IP", 'my.redacted.domain', 0);

I suppose it's getting these addresses from Docker's DNS resolver (the private network address, which is correct) and from an external DNS server. The redacted IP in the 2nd line is my actual IP as per my domain's DNS records, which are configured correctly. I suppose this would also happen with the SPF and TXT records if they existed in Docker's internal DNS, but here I'm just guessing.

What surely matters is this: those rows already exist in the database, so the script shouldn't try to add them again, hence the error. Removing the rows already in the database made the error message go away.

2

I've also noticed that I can add, but not remove, domains from the web interface: it removes the "deleted" domain from the screen, but if I go to another page and back, the domain is still there. It also shows "domain already exists" or similar if I try to add the "deleted" domain again. Web.sql.log does show a "delete" command being sent to the DB server, though. No idea why it doesn't go through. Hint: simply reloading the page doesn't reveal this, as the reloaded page still omits the "deleted" domain. I need to leave the "Mail Server" page and come back to see that the domains are still there.

3

Tcpdump only shows DNS queries from Community Server to the external server for the A record and for the SPF and RSA-key TXT records, but never for MX. Even those appear only sometimes and at random, not when I click "Verify" in "DNS settings". It simply never queries the external server for the MX record of the address configured in Mail Server.

Now look at these results:

root@redactedMailServer:/# dig @127.0.0.11 my.redacted.domain. mx

; <<>> DiG 9.11.3-1ubuntu1.16-Ubuntu <<>> @127.0.0.11 my.redacted.domain. mx
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36428
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;my.redacted.domain.              IN      MX

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Fri Mar 04 02:16:53 UTC 2022
;; MSG SIZE  rcvd: 34
root@redactedMailServer:/# dig @127.0.0.11 my.redacted.domain. txt

; <<>> DiG 9.11.3-1ubuntu1.16-Ubuntu <<>> @127.0.0.11 my.redacted.domain. txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43970
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;my.redacted.domain.              IN      TXT

;; ANSWER SECTION:
my.redacted.domain.       900     IN      TXT     "v=spf1 +mx ~all"

;; Query time: 701 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Fri Mar 04 02:17:15 UTC 2022
;; MSG SIZE  rcvd: 73

What happens is this: Docker's internal DNS returns no MX record and, most importantly, doesn't forward the request to the external server if a container's hostname is the same as the queried domain. It does, however, forward the other queries for some esoteric reason. Community Server just takes Docker's response and does not try to check with an external server. So there's a chicken-and-egg situation:

Using another host name during install breaks other functions, such as the "Let's Encrypt" certificate, which OO seems to derive from the hostname specified upon container creation.

TL;DR: the root cause seems to be Docker's broken handling of MX queries, but OOMS/CS should have a way to send requests to an external DNS server (even if we have to specify it manually somewhere) to confirm the MX record. The workaround suggested by OP creates other issues.

TL;DR #2: AAAARGHHH...

tuliogs commented 2 years ago

Update: the only workaround that worked was to delete Community Server's container, recreate it with --dns pointing to an external server and, before doing anything else, edit the container's /etc/resolv.conf to also point to that external server. Back on the "Mail Server" screen, it did look up the MX record and validated the domain.
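
To see which resolver a container actually consults after such a change, its /etc/resolv.conf can be inspected. The following is a hedged sketch rather than part of the workaround itself: the function name uses_external_dns is hypothetical, 127.0.0.11 is the Docker resolver address that appears in the dig output earlier in this thread, and the container name in the usage comment is the one that appears in a later comment.

```shell
#!/usr/bin/env bash
# Reads resolv.conf content on stdin.
# Returns 0 if at least one nameserver is listed and none of them is
# Docker's embedded resolver (127.0.0.11); returns 1 otherwise.
uses_external_dns() {
    local key ns rest found=1
    while read -r key ns rest || [ -n "$key" ]; do
        [ "$key" = "nameserver" ] || continue
        if [ "$ns" = "127.0.0.11" ]; then
            return 1
        fi
        found=0
    done
    return "$found"
}

# Usage (assumes a running container with this name):
#   docker exec onlyoffice-community-server cat /etc/resolv.conf | uses_external_dns \
#       && echo "container queries an external resolver directly"
```

On user-defined Docker networks, resolv.conf normally keeps pointing at 127.0.0.11 even when --dns is given, with the --dns servers used only as forwarders. That may be why the manual edit of /etc/resolv.conf was needed here, although this is an inference and not something confirmed in the thread.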

Recreating the container also got rid of the deleted domains.

(Edit 2022.03.05: never mind the next paragraphs, they were due to an entirely unrelated issue, though they could have been avoided if the MX thing had worked properly.)

Still can't send emails, though. Internal addresses fail with Reason: "Connection reset by peer". External addresses fail with Reason: "Smtp.ConnectAsync timeout".

Back to the drawing board.

tuliogs commented 2 years ago

Got it handled. We're talking about a bunch of different stuff, though some might be related. So I'll stick to the issue in OP.

TL;DR:

  1. OO forces us to set the MX record to the hostname of the Mail Server (OOMS) container;
  2. Docker's embedded DNS will NEVER resolve the MX record for that name, because it only holds addresses ("A" records) for its containers. If the name is a container's hostname, Docker returns the address (A) but reports that no MX exists, without even trying to forward the query to external servers;
  3. OO then takes the empty reply and assumes the MX record hasn't been published, so it never validates the domain. You CANNOT set a different hostname, i.e. one that isn't recognized by Docker;
  4. You have to resort to dirty little tricks to get out of this loop.

Long explanation

It's not an OnlyOffice bug, but rather a limitation, coupled with Docker's own limitation. Docker automatically answers DNS queries about its containers, so if you set the MX to the same name as the hostname, Docker responds to MX queries for that name with an empty answer, because it only holds the address (A) record.

In other words, if you leave DNS to Docker, it will never resolve MX queries for a name that matches one of its containers, which is quite a burden if you don't want to build a full DNS setup and the nightmare that entails. E.g., considering that my.domain.com is a container's hostname under Docker:

dig my.domain.com A

will have Docker respond with the container's actual internal IP, but:

dig my.domain.com MX

will have Docker respond with an empty reply, because it does NOT handle MX records, nor will it forward the query to the external server (and we can't even tell it to do so, which is dumb).
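
The difference between "no such name" and "name known, but no record of that type" is visible in the header of full dig output, as in the two transcripts posted earlier (status: NOERROR with ANSWER: 0 for MX, ANSWER: 1 for TXT). A small sketch for telling these cases apart follows. The function name classify_dig_reply is hypothetical, and it only parses the header lines that dig prints by default.

```shell
#!/usr/bin/env bash
# Reads full `dig` output on stdin and prints one of:
#   answered      - NOERROR with at least one answer record
#   empty         - NOERROR with zero answers (the case described above)
#   error:STATUS  - any other status, e.g. error:NXDOMAIN
#   unparsed      - no recognizable dig header found
classify_dig_reply() {
    local out status answers
    out="$(cat)"
    status="$(printf '%s\n' "$out" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1)"
    answers="$(printf '%s\n' "$out" | sed -n 's/.*ANSWER: \([0-9]*\),.*/\1/p' | head -n 1)"

    if [ -z "$status" ] || [ -z "$answers" ]; then
        echo "unparsed"
    elif [ "$status" != "NOERROR" ]; then
        echo "error:$status"
    elif [ "$answers" -eq 0 ]; then
        echo "empty"
    else
        echo "answered"
    fi
}

# Usage: dig my.domain.com MX | classify_dig_reply
```

An "empty" result from the container's default resolver, next to an "answered" result from an external one, would match the behaviour described in this comment.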

On the other hand, OnlyOffice should anticipate this and NOT force us to set the MX record to the hostname, which doesn't make sense either. For example, if I have a domain called example1.com and my container is named bah.abhlabhla.com, OO will force me to add an MX record to bah.abhlabhla.com, which totally defeats the purpose of internal naming. Even worse, it will NEVER recognize the MX record unless you spoof your own DNS server (i.e. if you're able to).

So there should be a way to specify another hostname for the MX record, NOT just the machine's name.

THE Workaround (Capital Letters)

In my case, the only way I got it to work properly was to set up a spoof Dnsmasq server for the specific domain I'm setting up and point both OOMS and OOCS to it. The resulting config was something like:

/etc/dnsmasq.d/fakenet.conf:

mx-host=fakenet.local,my.redacted.domain,0      # To force OO to recognize the MX record
mx-host=fakenet.local,server1.fakenet.local.,10     # the fake net I had to set up
mx-host=fakenet.local,server2.fakenet.local.,20
address=/server1.fakenet.local./172.20.0.4      # This is the same OOMS machine as
address=/server1.my.redacted.domain./172.20.0.4     # this one, that I needed to spoof.
address=/server2.fakenet.local./172.20.0.6      # This is OO Community Server, who makes the MX queries.
txt-record=fakenet.local.,"v=spf1 +mx ~all"     # Requested by OnlyOffice.
txt-record=dkim._domainkey.fakenet.local,"k=rsa; ***MY-VERY-MERRY-LONG-KEY-BY-OO ***  "

(Edit: I suppose you'll only need to spoof the MX record, as dnsmasq forwards any queries it can't answer itself to the upstream DNS servers - unlike dumb Docker. I'll try that later when I'm rested. :P )
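
If only the MX record needs spoofing, the dnsmasq fragment shrinks to a single mx-host line. The sketch below generates such a line; it has not been tested against OnlyOffice and is not confirmed anywhere in this thread. The function name print_mx_override and the output file name in the usage comment are hypothetical. The mx-host=<mx name>,<target host>,<preference> form is the same one used in the config above.

```shell
#!/usr/bin/env bash
# Prints a dnsmasq fragment that answers only the MX query locally.
# All other queries are left to dnsmasq's normal upstream forwarding.
# Arguments: DOMAIN [TARGET] [PREFERENCE]
#   TARGET defaults to DOMAIN, PREFERENCE defaults to 0.
print_mx_override() {
    local domain="$1" target="${2:-$1}" pref="${3:-0}"
    if [ -z "$domain" ]; then
        echo "usage: print_mx_override DOMAIN [TARGET] [PREFERENCE]" >&2
        return 1
    fi
    case "$pref" in
        ''|*[!0-9]*)
            echo "preference must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf 'mx-host=%s,%s,%s\n' "$domain" "$target" "$pref"
}

# Usage (hypothetical file name; needs root and a dnsmasq restart):
#   print_mx_override my.redacted.domain > /etc/dnsmasq.d/mx-only.conf
```

Both containers would still have to be pointed at the dnsmasq instance for this to have any effect.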

The only reason I created this fake net mess is so I don't compromise my actual domain while I'm still setting up both OO and dnsmasq, before sending them to production. When it's all done, I'll still need a full internal DNS setup just so OO can find the MX record, and for NO other reason, given that Docker already provides all the internal DNS functions I need for daily operations. (Granted: a simpler, non-critical setup could skip the fake net and spoof only the production domain.)

All that work would be eliminated if we simply had a way to set up the MX record independently of the internal machine name we use for OOMS (OnlyOffice Mail Server), so it can bypass Docker's (and potentially others') limitations.

Can we have this included in a future release, please?

It would spare tons of time and frustration; I've been fighting this issue for over a week, because it coincided with 2 other issues with email alone.

Poloxin commented 1 year ago

Hi, thanks for such a detailed answer. Is this still relevant? Have the OO developers not fixed the problem?

tuliogs commented 1 year ago

Gee, that's a good question. It's been so long I didn't even remember about the issue, and those servers have been replaced anyway. Ironically, I may need to install another server in the coming days, but please don't count on that - it'll only happen if the competitor system doesn't work as expected.

To answer your last question, though, I never heard anything from the devs.

Poloxin commented 1 year ago

Thanks for the reply. I will seek help on the official forum. Good luck.

awnz commented 12 months ago

Dumping my notes here as it seems as good a place as any. I have encountered something with similar behaviour, but a different cause.

When trying to configure my first domain in Mail, I got a value of "." for MX "Text/Value" in the Add domain wizard.

It would then fail with "Internal server error. Try again later", and in the browser debug console I could see the server was returning a 400: "Value cannot be null. Parameter name: mxRecord".

I worked around this by running:

docker exec -it onlyoffice-mysql-server mysql -pmy-secret-pw -e "use onlyoffice; update mail_server_dns set mx='mx.my.domain';"

(where "mx.my.domain" is the A record I'd set up for the reverse proxy I have ONLYOFFICE sitting behind).

This showed up in the web UI, and I hit a new failure: "Unknown database 'onlyoffice'mailserver'"

I've been getting that same error the whole time if trying to (re)connect the mail server under Settings > Integration > Mail Server. And indeed, that database does not exist on the database host onlyoffice-mysql-server.

So it seems at some point the mail server database failed to initialize.

Looking into that:

root@onlyoffice:~# docker exec -it onlyoffice-mail-server bash
[root@office /]# mysql -h $MYSQL_SERVER -u $MYSQL_ROOT_USER --password=$MYSQL_ROOT_PASSWD 
ERROR 1251 (08004): Client does not support authentication protocol requested by server; consider upgrading MySQL client

One forum result for that error: https://forum.onlyoffice.com/t/getting-a-error-in-authentication-sql-server-does-not-support-it-cant-use-mail-server/6425/5

Indeed, mysql version mismatch between onlyoffice-mysql-server and onlyoffice-mail-server:

# in onlyoffice-mail-server
root@onlyoffice:~# docker exec -it onlyoffice-mail-server mysql --version
mysql  Ver 14.14 Distrib 5.1.73, for redhat-linux-gnu (x86_64) using readline 5.1

# in onlyoffice-mysql-server
root@onlyoffice:~# docker exec -it onlyoffice-mysql-server mysql --version
mysql  Ver 8.0.29 for Linux on x86_64 (MySQL Community Server - GPL)

# the community server
root@onlyoffice:~# docker exec -it onlyoffice-community-server mysql --version
mysql  Ver 8.0.33-0ubuntu0.22.04.2 for Linux on x86_64 ((Ubuntu))

So mysql in the mail server for some reason is really old.
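
The version comparison above can be automated across containers. This is a rough sketch with hypothetical function names (mysql_version_of, client_is_older). It only parses the two `mysql --version` formats shown in the output above, and a major-version gap is a hint worth investigating, not proof of an authentication incompatibility.

```shell
#!/usr/bin/env bash
# Reads one line of `mysql --version` output on stdin and prints the
# MySQL distribution version (e.g. 5.1.73 or 8.0.29).
mysql_version_of() {
    local line
    line="$(cat)"
    case "$line" in
        # Old clients: "Ver 14.14 Distrib 5.1.73, for ..."
        *"Distrib "*) printf '%s\n' "$line" | sed 's/.*Distrib \([0-9.]*\).*/\1/' ;;
        # Newer clients: "Ver 8.0.29 for Linux ..."
        *"Ver "*)     printf '%s\n' "$line" | sed 's/.*Ver \([0-9.]*\).*/\1/' ;;
        *)            return 1 ;;
    esac
}

# Succeeds when the client's major version is lower than the server's.
client_is_older() {
    local client_major="${1%%.*}" server_major="${2%%.*}"
    [ "$client_major" -lt "$server_major" ]
}

# Usage (container names as used above):
#   c="$(docker exec onlyoffice-mail-server mysql --version | mysql_version_of)"
#   s="$(docker exec onlyoffice-mysql-server mysql --version | mysql_version_of)"
#   client_is_older "$c" "$s" && echo "mail server client ($c) is older than server ($s)"
```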

This is installed on ubuntu-22.04 using workspace-install.sh and Docker reports I'm on the latest version of onlyoffice/mailserver:1.6.75.

The workaround was to enable mysql_native_password authentication so that the mailserver could connect, then remove and reinstall the mail server so it correctly initializes its database.

I've outlined how I did that in the forum here: https://forum.onlyoffice.com/t/getting-a-error-in-authentication-sql-server-does-not-support-it-cant-use-mail-server/6425/9?u=awnz

KaKi87 commented 9 months ago

Hello. In the end, what's the simplest and cleanest way to solve this? Thanks.