mailcow / mailcow-dockerized

mailcow: dockerized - 🐮 + 🐋 = 💕
https://mailcow.email
GNU General Public License v3.0

Dovecot fails to start when sni.conf has too many entries #5486

Open awsumco opened 1 year ago

awsumco commented 1 year ago

Contribution guidelines

I've found a bug and checked that ...

Description

On a Mailcow setup with a large number of hosted domains (1500+), Dovecot fails to start because sni.conf either has too many local_name entries or the file is too large to load into memory.

The "fix" was to rewrite the local_name blocks so that each one lists, on a single line, all domains from the domains file that is created by ACME.

For example, sni.conf is currently generated with one block like the following per domain:

local_name autoconfig.example.com {
  ssl_cert = </etc/ssl/mail/subdomain.example.com/cert.pem
  ssl_key = </etc/ssl/mail/subdomain.example.com/key.pem
}

I suggest placing all of a certificate's domains on one line, so that the certificate is loaded only once instead of once per domain, like below.

local_name "autoconfig.example.com autodiscover.example.com san.example.com" {
  ssl_cert = </etc/ssl/mail/subdomain.example.com/cert.pem
  ssl_key = </etc/ssl/mail/subdomain.example.com/key.pem
}

With the above change, Dovecot started and worked.
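To see how much duplication a given sni.conf carries, a small helper along these lines could compare the number of local_name blocks with the number of distinct certificate files they reference. This is only a sketch: the function name sni_stats is hypothetical, and it assumes the file uses exactly the block layout shown above.

```bash
#!/usr/bin/env bash
# Summarise an sni.conf-style file: count the local_name blocks and the
# distinct certificate paths those blocks point at.
# sni_stats is a hypothetical helper name, not part of mailcow or Dovecot.
sni_stats() {
  local file=$1
  awk '
    $1 == "local_name" { blocks++ }        # one per block header
    $1 == "ssl_cert"   { certs[$3] = 1 }   # $3 is the "<path" token
    END {
      n = 0
      for (c in certs) n++
      printf "blocks=%d distinct_certs=%d\n", blocks, n
    }
  ' "$file"
}
```

Called as sni_stats /etc/dovecot/sni.conf, the per-domain layout would report more blocks than distinct certificates, while the grouped layout would report the two numbers as equal.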

The pros being:

Logs:

mailcowdockerized-dovecot-mailcow-1  | Uptime: 386  Threads: 78  Questions: 58361  Slow queries: 0  Opens: 113  Open tables: 104  Queries per second avg: 151.194
mailcowdockerized-dovecot-mailcow-1  | The user `vmail' is already a member of `tty'.
mailcowdockerized-dovecot-mailcow-1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
mailcowdockerized-dovecot-mailcow-1  |                                  Dload  Upload   Total   Spent    Left  Speed
100  112k  100  112k    0     0  67885      0  0:00:01  0:00:01 --:--:-- 67845
mailcowdockerized-dovecot-mailcow-1  | 20_blatspammer.cf
mailcowdockerized-dovecot-mailcow-1  | 70_HS_body.cf
mailcowdockerized-dovecot-mailcow-1  | 70_HS_header.cf
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:31:59,329 INFO Set uid to user 0 succeeded
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:31:59,333 INFO supervisord started with pid 1
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:00,342 INFO spawned: 'processes' with pid 1540
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:00,348 INFO spawned: 'dovecot' with pid 1541
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:00,362 INFO spawned: 'syslog-ng' with pid 1542
mailcowdockerized-dovecot-mailcow-1  | [2023-10-22T10:32:00.443352] WARNING: With use-dns(no), dns-cache() will be forced to 'no' too!;
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:00 49201172b5e5 syslog-ng[1542]: syslog-ng starting up; version='3.28.1'
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:01,457 INFO success: processes entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:01,457 INFO success: dovecot entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
mailcowdockerized-dovecot-mailcow-1  | 2023-10-22 10:32:01,457 INFO success: syslog-ng entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:16 49201172b5e5 dovecot: master: Dovecot v2.3.21 (47349e2482) starting up for imap, sieve, lmtp, pop3
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:21 49201172b5e5 dovecot: replicator: Error: conn unix:/run/dovecot/stats-writer (pid=1541,uid=0): Timeout waiting for handshake response
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:21 49201172b5e5 dovecot: managesieve-login: Error: conn unix:stats-writer (pid=1541,uid=0): Timeout waiting for handshake response
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:21 49201172b5e5 dovecot: managesieve-login: Error: conn unix:stats-writer (pid=1541,uid=0): Timeout waiting for handshake response
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:23 49201172b5e5 dovecot: imap-login: Error: conn unix:stats-writer (pid=1541,uid=0): Timeout waiting for handshake response
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:23 49201172b5e5 dovecot: lmtp: Error: conn unix:/run/dovecot/stats-writer (pid=1541,uid=0): Timeout waiting for handshake response
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:26 49201172b5e5 dovecot: master: Error: service(anvil): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:26 49201172b5e5 dovecot: anvil: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:26 49201172b5e5 dovecot: stats: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:26 49201172b5e5 dovecot: master: Error: service(stats): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:31 49201172b5e5 dovecot: replicator: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:31 49201172b5e5 dovecot: master: Error: service(replicator): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:31 49201172b5e5 dovecot: master: Error: service(managesieve-login): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:31 49201172b5e5 dovecot: managesieve-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:31 49201172b5e5 dovecot: managesieve-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:33 49201172b5e5 dovecot: imap-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:33 49201172b5e5 dovecot: master: Error: service(imap-login): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:33 49201172b5e5 dovecot: lmtp: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:33 49201172b5e5 dovecot: master: Error: service(lmtp): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: config: Fatal: block_alloc(268435456): Out of memory
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: config: Error: Raw backtrace: /usr/lib/dovecot/libdovecot.so.0(backtrace_append+0x42) [0x7f8164b8a482] -> /usr/lib/dovecot/libdovecot.so.0(backtrace_get+0x1e) [0x7f8164b8a59e] -> /usr/lib/dovecot/libdovecot.so.0(+0x1041fb) [0x7f8164b971fb] -> /usr/lib/dovecot/libdovecot.so.0(+0x104291) [0x7f8164b97291] -> /usr/lib/dovecot/libdovecot.so.0(+0x5685f) [0x7f8164ae985f] -> /usr/lib/dovecot/libdovecot.so.0(+0x5ae57) [0x7f8164aede57] -> /usr/lib/dovecot/libdovecot.so.0(+0x1243cc) [0x7f8164bb73cc] -> /usr/lib/dovecot/libdovecot.so.0(p_strconcat+0x11e) [0x7f8164bcd43e] -> dovecot/config(+0x109e1) [0x5599ca3a09e1] -> dovecot/config(+0x11121) [0x5599ca3a1121] -> /usr/lib/dovecot/libdovecot.so.0(settings_check+0x4e) [0x7f8164b270de] -> /usr/lib/dovecot/libdovecot.so.0(settings_parser_check+0x53) [0x7f8164b27283] -> dovecot/config(config_parse_file+0xd0a) [0x5599ca3a615a] -> dovecot/config(main+0x9a) [0x5599ca39eefa] -> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f8164799d0a] -> dovecot/config(_start+0x2a) [0x5599ca39efaa]
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: master: Error: service(config): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: config: Fatal: master: service(config): child 1554 returned error 83 (Out of memory (service config { vsz_limit=1024 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: pop3-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Connection reset by peer - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: auth: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Connection reset by peer - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: master: Error: service(auth): command startup failed, throttling for 2.000 secs
mailcowdockerized-dovecot-mailcow-1  | Oct 22 10:32:34 49201172b5e5 dovecot: master: Error: service(pop3-login): command startup failed, throttling for 2.000 secs

Steps to reproduce:

1. Have 1500+ domains
2. Have an extra SAN of mail.*
3. Restart dovecot-mailcow so sni.conf gets created.
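For testing without 1500 real domains, the directory layout that the entrypoint loop reads could be faked with a sketch like the one below. Everything in it is an assumption for illustration: the function name make_cert_dirs, the domainN.example.com naming, and the one-name-per-line format of the domains file. It creates only the domains files, so cert.pem and key.pem would still have to be supplied separately before Dovecot could load the result.

```bash
#!/usr/bin/env bash
# Create COUNT hypothetical certificate directories under BASE, each holding
# a "domains" file with three host names. No certificates are generated.
make_cert_dirs() {
  local base=$1 count=$2 i name
  for ((i = 1; i <= count; i++)); do
    name="domain${i}.example.com"   # made-up naming scheme
    mkdir -p "${base}/${name}"
    printf '%s\n' "autoconfig.${name}" "autodiscover.${name}" "mail.${name}" \
      > "${base}/${name}/domains"
  done
}
```

For example, make_cert_dirs /tmp/fake-ssl 1500 would produce 1500 directories to point a test copy of the loop at.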

Which branch are you using?

master

Operating System:

Debian 10.13

Server/VM specifications:

16 GB / 4 CPU cores

Is Apparmor, SELinux or similar active?

no

Virtualization technology:

Hyper-V

Docker version:

24.0.6

docker-compose version or docker compose version:

v2.21.0

mailcow version:

2023-10a

Reverse proxy:

NA

Logs of git diff:

diff --git a/data/Dockerfiles/dovecot/docker-entrypoint.sh b/data/Dockerfiles/dovecot/docker-entrypoint.sh
index b2633c27..b19ee123 100755
--- a/data/Dockerfiles/dovecot/docker-entrypoint.sh
+++ b/data/Dockerfiles/dovecot/docker-entrypoint.sh
@@ -270,12 +270,12 @@ for cert_dir in /etc/ssl/mail/*/ ; do
     continue
   fi
   domains=($(cat ${cert_dir}domains))
-  for domain in ${domains[@]}; do
-    echo 'local_name '${domain}' {' >> /etc/dovecot/sni.conf;
-    echo '  ssl_cert = <'${cert_dir}'cert.pem' >> /etc/dovecot/sni.conf;
-    echo '  ssl_key = <'${cert_dir}'key.pem' >> /etc/dovecot/sni.conf;
-    echo '}' >> /etc/dovecot/sni.conf;
-  done
+cat <<EOF >> /etc/dovecot/sni.conf
+local_name "${domains[@]}" {
+  ssl_cert = <${cert_dir}cert.pem
+  ssl_key = <${cert_dir}key.pem
+}
+EOF
 done
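The patched loop can also be tried outside the container. The following is a standalone sketch of the same idea, not the code from the patch: the function name render_grouped_sni is hypothetical, it prints to stdout instead of appending to /etc/dovecot/sni.conf, and it takes the certificate base directory as an argument.

```bash
#!/usr/bin/env bash
# Print one local_name block per certificate directory under BASE, listing
# every name from that directory's "domains" file in a single quoted string.
render_grouped_sni() {
  local base=$1 cert_dir
  local -a domains
  for cert_dir in "${base}"/*/; do
    # Skip directories without a domains file (also covers an unmatched glob).
    [[ -f "${cert_dir}domains" ]] || continue
    # Read the whole file and split it on whitespace; read returns non-zero
    # at end of file, hence the "|| true".
    read -r -d '' -a domains < "${cert_dir}domains" || true
    (( ${#domains[@]} > 0 )) || continue
    printf 'local_name "%s" {\n' "${domains[*]}"
    printf '  ssl_cert = <%scert.pem\n' "${cert_dir}"
    printf '  ssl_key = <%skey.pem\n' "${cert_dir}"
    printf '}\n'
  done
}
```

Redirecting the output to a scratch file, for example render_grouped_sni /etc/ssl/mail > /tmp/sni.conf, allows the result to be inspected before it replaces anything.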

Logs of iptables -L -vn:

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
9861K   11G DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0
9861K   11G DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0
6878K 7394M ACCEPT     all  --  *      br-mailcow  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
 274K   18M DOCKER     all  --  *      br-mailcow  0.0.0.0/0            0.0.0.0/0
2708K 3750M ACCEPT     all  --  br-mailcow !br-mailcow  0.0.0.0/0            0.0.0.0/0
 212K   15M ACCEPT     all  --  br-mailcow br-mailcow  0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.2           tcp dpt:8983
    0     0 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.249         tcp dpt:6379
    0     0 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.6           tcp dpt:3306
13809  819K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.8           tcp dpt:443
 2605  152K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.8           tcp dpt:80
    0     0 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:12345
    2   120 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:4190
 3969  217K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:995
12977  783K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:993
 2621  158K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:143
  521 31544 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.250         tcp dpt:110
    6   312 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.253         tcp dpt:588
  700 43096 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.253         tcp dpt:587
 1335 75088 ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.253         tcp dpt:465
 1742  103K ACCEPT     tcp  --  !br-mailcow br-mailcow  0.0.0.0/0            172.22.1.253         tcp dpt:25

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination
2708K 3750M DOCKER-ISOLATION-STAGE-2  all  --  br-mailcow !br-mailcow  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       all  --  *      br-mailcow  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination
  22M   18G RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
# Warning: iptables-legacy tables present, use iptables-legacy to see them

Logs of ip6tables -L -vn:

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all      *      *       ::/0                 ::/0

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER-ISOLATION-STAGE-2  all      br-mailcow !br-mailcow  ::/0                 ::/0
    0     0 RETURN     all      *      *       ::/0                 ::/0

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       all      *      br-mailcow  ::/0                 ::/0
    0     0 RETURN     all      *      *       ::/0                 ::/0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER-USER  all      *      *       ::/0                 ::/0
    0     0 DOCKER-ISOLATION-STAGE-1  all      *      *       ::/0                 ::/0
    0     0 DOCKER     all      *      br-mailcow  ::/0                 ::/0
    0     0 ACCEPT     all      *      br-mailcow  ::/0                 ::/0                 ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     all      br-mailcow !br-mailcow  ::/0                 ::/0
    0     0 ACCEPT     all      br-mailcow br-mailcow  ::/0                 ::/0

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
# Warning: ip6tables-legacy tables present, use ip6tables-legacy to see them

Logs of iptables -L -vn -t nat:

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 296K   18M DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 139K   12M MASQUERADE  all  --  *      !br-mailcow  172.22.1.0/24        0.0.0.0/0
    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.2           172.22.1.2           tcp dpt:8983
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.249         172.22.1.249         tcp dpt:6379
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.6           172.22.1.6           tcp dpt:3306
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.8           172.22.1.8           tcp dpt:443
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.8           172.22.1.8           tcp dpt:80
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:12345
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:4190
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:995
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:993
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:143
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.250         172.22.1.250         tcp dpt:110
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.253         172.22.1.253         tcp dpt:588
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.253         172.22.1.253         tcp dpt:587
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.253         172.22.1.253         tcp dpt:465
    0     0 MASQUERADE  tcp  --  *      *       172.22.1.253         172.22.1.253         tcp dpt:25

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
 5222  313K RETURN     all  --  br-mailcow *       0.0.0.0/0            0.0.0.0/0
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0
    0     0 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            127.0.0.1            tcp dpt:18983 to:172.22.1.2:8983
    0     0 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            127.0.0.1            tcp dpt:7654 to:172.22.1.249:6379
    0     0 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            127.0.0.1            tcp dpt:13306 to:172.22.1.6:3306
13796  818K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:443 to:172.22.1.8:443
 2606  152K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:172.22.1.8:80
    0     0 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            127.0.0.1            tcp dpt:19991 to:172.22.1.250:12345
    2   120 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:4190 to:172.22.1.250:4190
 4146  226K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:995 to:172.22.1.250:995
13043  784K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:993 to:172.22.1.250:993
 2554  153K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:143 to:172.22.1.250:143
  522 31608 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:110 to:172.22.1.250:110
    6   312 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:588 to:172.22.1.253:588
  702 43216 DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:587 to:172.22.1.253:587
 2103  117K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:465 to:172.22.1.253:465
 1749  103K DNAT       tcp  --  !br-mailcow *       0.0.0.0/0            0.0.0.0/0            tcp dpt:25 to:172.22.1.253:25
# Warning: iptables-legacy tables present, use iptables-legacy to see them

Logs of ip6tables -L -vn -t nat:

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all      br-mailcow *       ::/0                 ::/0

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all      *      *       ::/0                 ::/0                 ADDRTYPE match dst-type LOCAL

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all      *      *       ::/0                !::1                  ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all      *      br-mailcow  ::/0                 ::/0                 ADDRTYPE match dst-type LOCAL
    0     0 MASQUERADE  all      *      !br-mailcow  fd4d:6169:6c63:6f77::  ::/0

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
# Warning: ip6tables-legacy tables present, use ip6tables-legacy to see them

DNS check:

104.18.22.201
104.18.23.201

DerLinkman commented 1 year ago

@MAGICCC @mkuron Feedback to this approach?

mkuron commented 1 year ago

It doesn't really solve the problem. Now you might need 3000+ domains to cross the limit, but you will still cross it at some point. I'm sure there are configuration options that can be changed to raise the limit, though unfortunately @awsumco's log messages are very unspecific, so it's hard to know which ones. On https://doc.dovecot.org/settings/core/, I found config_cache_size (Default: 1M, The maximum size of the in-memory configuration cache. The cache should be large enough to allow keeping the full, parsed Dovecot configuration in memory. The default is almost always large enough, unless your system has numerous large TLS certificates in the configuration.) and default_vsz_limit (Default: 256M, The default virtual memory size limit for service processes.). I am pretty sure both of these should be tuned on large Mailcow installs, perhaps dynamically depending on the number of SNI domains.
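Sizing a limit by the number of SNI entries could look roughly like the sketch below. It is purely illustrative: the function name suggest_vsz_limit_mb is hypothetical, the 256 MB base is simply the default_vsz_limit default quoted above, and the per-block cost of 512 KB is an invented placeholder rather than a measured value.

```bash
#!/usr/bin/env bash
# Suggest a vsz_limit in MB from the number of local_name blocks in FILE.
# base_mb and per_block_kb are illustrative assumptions, not measurements.
suggest_vsz_limit_mb() {
  local file=$1 base_mb=${2:-256} per_block_kb=${3:-512} blocks
  # grep -c prints 0 but exits non-zero when nothing matches.
  blocks=$(grep -c '^local_name ' "$file" || true)
  # Round the per-block total up to whole megabytes.
  echo $(( base_mb + (blocks * per_block_kb + 1023) / 1024 ))
}
```

The printed number would still need to be written into the relevant service block by whatever generates the configuration, and real per-block memory use would have to be measured before trusting any such formula.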

awsumco commented 1 year ago

I will look into config_cache_size (this makes sense to me) and get back with more information here. I also agree that the log was not very helpful, and increasing verbosity did not reveal any more information. I only stumbled across sni.conf while looking at another Dovecot server setup with a huge number of hosted domains; I made the changes and, to my surprise, the service started.

As for reaching the same error at some stage: yes, I agree it will happen. However, I have picked up other scaling problems with Mailcow, which I am happy to share if anyone is interested. That said, I will set a maximum domain limit on the setup.

I suppose you could look at this contribution as a neater way to set up the sni.conf file.

FreddleSpl0it commented 4 days ago

I replicated the issue with 2000 domains, and Dovecot says:

Nov 20 12:03:01 cae2cdf0b28b dovecot: config: Fatal: master: service(config): child 2358 returned error 83 (Out of memory (service config { vsz_limit=1024 MB }, you may need to increase it) - set CORE_OUTOFMEM=1 environment to get core dump)

In dovecot.conf, I added vsz_limit = 2048 MB to service config:

service config {
  vsz_limit = 2048 MB
  unix_listener config {
    user = root
    group = vmail
    mode = 0660
  }
}

The out-of-memory log is gone, but the config service now takes too long to return the Dovecot configuration, which results in:

Nov 20 12:14:47 cae2cdf0b28b dovecot: managesieve-login: Error: conn unix:stats-writer (pid=2162,uid=0): Timeout waiting for handshake response
Nov 20 12:14:47 cae2cdf0b28b dovecot: managesieve-login: Error: conn unix:stats-writer (pid=2162,uid=0): Timeout waiting for handshake response
Nov 20 12:14:47 cae2cdf0b28b dovecot: replicator: Error: conn unix:/run/dovecot/stats-writer (pid=2162,uid=0): Timeout waiting for handshake response
Nov 20 12:14:52 cae2cdf0b28b dovecot: anvil: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
Nov 20 12:14:52 cae2cdf0b28b dovecot: master: Error: service(anvil): command startup failed, throttling for 2.000 secs
Nov 20 12:14:52 cae2cdf0b28b dovecot: stats: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
Nov 20 12:14:52 cae2cdf0b28b dovecot: master: Error: service(stats): command startup failed, throttling for 2.000 secs
Nov 20 12:14:57 cae2cdf0b28b dovecot: managesieve-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
Nov 20 12:14:57 cae2cdf0b28b dovecot: master: Error: service(managesieve-login): command startup failed, throttling for 2.000 secs
Nov 20 12:14:57 cae2cdf0b28b dovecot: managesieve-login: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)
Nov 20 12:14:57 cae2cdf0b28b dovecot: replicator: Fatal: Error reading configuration: read(/run/dovecot/config) failed: read(size=8192) failed: Interrupted system call - Also failed to read config by executing doveconf: /run/dovecot/config is a UNIX socket (path is from CONFIG_FILE environment)

@mkuron I think in such big setups we should offload SSL termination to nginx.

mkuron commented 4 days ago

> I think in such big setups we should offload SSL termination to nginx.

Good idea.

FreddleSpl0it commented 4 days ago

Should we use nginx as a TCP or IMAP proxy to offload SSL termination? I'm not quite sure if there are any downsides to using nginx as an IMAP proxy.

mkuron commented 3 days ago

> Should we use nginx as a TCP or IMAP proxy to offload SSL termination?

It would have to be an IMAP proxy so the remote IP address can be passed through.

> I'm not quite sure if there are any downsides to using nginx as an IMAP proxy.

I could imagine it having load issues with large numbers of long-lived connections, e.g. for IMAP IDLE. But we would have to test it to find out if that is even relevant when compared to the load on Dovecot.