mailcow / mailcow-dockerized

mailcow: dockerized - 🐮 + 🐋 = 💕
https://mailcow.email
GNU General Public License v3.0

If it is not currently possible to control the time of day that certificate renewals are attempted, could this be added as a feature? #5627

Closed chriscroome closed 5 months ago

chriscroome commented 9 months ago

Contribution guidelines

I've found a bug and checked that ...

Description

One of our Mailcow servers is fairly large, 64GB RAM, 12 CPU cores, 1.5TB disk, 327 domains and 1,674 mailboxes.

Almost every day the Postfix, Dovecot and Nginx containers are restarted twice in quick succession, and as a result the IMAP and SMTP services are unavailable for around 5 minutes. During this time clients cannot send or receive email. When the containers are back up and running we get a "Subject: Watchdog ALERT: certcheck" email. I'm convinced that this downtime is related to certificate renewals.

The issue is not so much the downtime itself (although it is very annoying) but rather that all attempts to get the renewals to take place in the middle of the night UK time have failed (I tried force-renewing all certs at 3am, for example). They always seem to take place during office hours, and as a result we very often get complaints from users that their email is not working during the outage.

We currently have a cron job to renew certs monthly:

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
# renew all certs at 23:01 on the 10th of each month
1 23 10 * * /usr/local/bin/mailcow_cert.sh

The mailcow_cert.sh script:

#!/usr/bin/env bash

set -Eeuo pipefail

mailcow_path=/var/mailcow
acme_c=$(docker ps -qaf name=acme-mailcow)
postfix_c=$(docker ps -qaf name=postfix-mailcow)
dovecot_c=$(docker ps -qaf name=dovecot-mailcow)
nginx_c=$(docker ps -qaf name=nginx-mailcow)

if [[ -d "${mailcow_path}" ]]
then
  # https://docs.mailcow.email/post_installation/firststeps-ssl/#force-renewal
  cd "${mailcow_path}"
  touch data/assets/ssl/force_renew
  echo "Restarting ACME container ${acme_c}"
  docker restart "${acme_c}"
  # echo "Sleeping for 120 seconds"
  # sleep 120
  # echo "Restarting Postfix, Dovecot and Nginx containers, ${postfix_c}, ${dovecot_c} and ${nginx_c}"
  # docker restart "${postfix_c}" "${dovecot_c}" "${nginx_c}"
elif [[ -L "${mailcow_path}" ]]
then
  echo "The mailcow_path, ${mailcow_path} is a symlink, please update ${0} to use the Mailcow directory"
  exit 1
else
  echo "The mailcow_path, ${mailcow_path} is not a directory or a symlink, please update ${0} to use the Mailcow directory"
  exit 1
fi

This issue has been ongoing for over a year. We have a support license for this server and I did raise a Servercow ticket for this, but that didn't result in a solution.

If there is currently no way to control the time of day that certificate renewals are done can this be considered as a feature request?

Logs:

Example Watchdog email:

Date: Thu, 4 Jan 2024 10:37:42 +0000
From: watchdog@mail.webarch.email
To: chris@webarchitects.co.uk
Subject: Watchdog ALERT: certcheck
X-Mailer: smtp-cli 3.10, see http://smtp-cli.logix.cz

IMAP OK - 0.036 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.036262s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.037 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.036616s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.033 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.032927s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.040 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.039808s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.039 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.039482s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.041 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.040824s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.040 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.040111s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.037 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.036815s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.042 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.041765s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.040 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.040437s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.047 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.046810s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.039 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.039251s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.041 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.040719s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.037 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.036690s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.042 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.042132s;;;0.000000;10.000000
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.028 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.027768s;;;0.000000;10.000000
SMTP UNKNOWN - Cannot read EHLO response via TLS.
connect to address 172.22.1.250 and port 993: Connection refused
SMTP UNKNOWN - Cannot read EHLO response via TLS.
connect to address 172.22.1.250 and port 993: Connection refused
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
SSL OK - Certificate 'mail.webarch.email' will expire in 65 days on 2024-03-09 22:06 +0000/GMT.
IMAP OK - 0.028 second response time on 172.22.1.250 port 993 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ AUTH=PLAIN AUTH=LOGIN] Dovecot ready.]|time=0.028456s;;;0.000000;10.000000

Example docker ps results illustrating the uptime of the containers:

docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED      STATUS                PORTS                                                                                                                                                                                                                               NAMES
f339ec009ae5   mailcow/watchdog:2.00    "/bin/sh -c /watchdo…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-watchdog-mailcow-1
e6188b782907   mcuadros/ofelia:latest   "/usr/bin/ofelia dae…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-ofelia-mailcow-1
7dbc8b1374da   mailcow/rspamd:1.94      "/docker-entrypoint.…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-rspamd-mailcow-1
640382fec083   mailcow/netfilter:1.54   "/bin/sh -c /app/doc…"   5 days ago   Up 4 days                                                                                                                                                                                                                                                 mailcow-netfilter-mailcow-1
c00bd541d31f   mailcow/acme:1.85        "/sbin/tini -g -- /s…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-acme-mailcow-1
a2dac1c354f6   mailcow/dovecot:1.26     "/docker-entrypoint.…"   5 days ago   Up 37 minutes         0.0.0.0:110->110/tcp, :::110->110/tcp, 0.0.0.0:143->143/tcp, :::143->143/tcp, 0.0.0.0:993->993/tcp, :::993->993/tcp, 0.0.0.0:995->995/tcp, :::995->995/tcp, 0.0.0.0:4190->4190/tcp, :::4190->4190/tcp, 127.0.0.1:19991->12345/tcp   mailcow-dovecot-mailcow-1
48b5daff08f8   mailcow/postfix:1.73     "/docker-entrypoint.…"   5 days ago   Up 37 minutes         0.0.0.0:25->25/tcp, :::25->25/tcp, 0.0.0.0:465->465/tcp, :::465->465/tcp, 0.0.0.0:587->587/tcp, :::587->587/tcp, 588/tcp                                                                                                            mailcow-postfix-mailcow-1
febdc44c5f73   nginx:mainline-alpine    "/docker-entrypoint.…"   5 days ago   Up 37 minutes         0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp                                                                                                                                                            mailcow-nginx-mailcow-1
79c40dd8d43a   mariadb:10.5             "docker-entrypoint.s…"   5 days ago   Up 5 days             127.0.0.1:13306->3306/tcp                                                                                                                                                                                                           mailcow-mysql-mailcow-1
0824068a296c   mailcow/clamd:1.63       "/sbin/tini -g -- /c…"   5 days ago   Up 5 days (healthy)   3310/tcp, 7357/tcp                                                                                                                                                                                                                  mailcow-clamd-mailcow-1
82b2e91c57f2   mailcow/phpfpm:1.85      "/docker-entrypoint.…"   5 days ago   Up 5 days             9000/tcp                                                                                                                                                                                                                            mailcow-php-fpm-mailcow-1
d4c369a2fe90   redis:7-alpine           "docker-entrypoint.s…"   5 days ago   Up 5 days             127.0.0.1:7654->6379/tcp                                                                                                                                                                                                            mailcow-redis-mailcow-1
0e3235cc6da6   mailcow/solr:1.8.1       "docker-entrypoint.s…"   5 days ago   Up 5 days             127.0.0.1:18983->8983/tcp                                                                                                                                                                                                           mailcow-solr-mailcow-1
5f84b2b8c27a   mailcow/sogo:1.120       "/docker-entrypoint.…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-sogo-mailcow-1
c811688077f9   mailcow/dockerapi:2.06   "/bin/sh /app/docker…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-dockerapi-mailcow-1
621a26055968   memcached:alpine         "docker-entrypoint.s…"   5 days ago   Up 5 days             11211/tcp                                                                                                                                                                                                                           mailcow-memcached-mailcow-1
b798b1c18909   mailcow/olefy:1.11       "python3 -u /app/ole…"   5 days ago   Up 5 days                                                                                                                                                                                                                                                 mailcow-olefy-mailcow-1
531567b7b155   mailcow/unbound:1.18     "/docker-entrypoint.…"   5 days ago   Up 5 days (healthy)   53/tcp, 53/udp                                                                                                                                                                                                                      mailcow-unbound-mailcow-1
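
The "Up 37 minutes" status only gives an approximate restart time. A minimal sketch for recording the exact start time of the affected containers is shown here; the function name container_start_times is hypothetical, and it assumes the docker CLI is available to the user running it:

```bash
#!/usr/bin/env bash

# Hypothetical helper: print "<container name> <start time>" for every
# running container whose name matches one of the given patterns.
container_start_times() {
  local pattern id
  for pattern in "$@"; do
    # "docker ps --filter name=..." matches on part of the container name.
    for id in $(docker ps --quiet --filter "name=${pattern}"); do
      # State.StartedAt is the time the container was last started.
      docker inspect --format '{{.Name}} {{.State.StartedAt}}' "${id}"
    done
  done
}
```

Calling container_start_times postfix-mailcow dovecot-mailcow nginx-mailcow from a cron job and appending the output to a file would give a history of restart times that can be compared with the Watchdog emails.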

We have logs going to /var/log/syslog and I'd be happy to grep / zgrep them if anyone has suggestions for patterns. They are too big to post here!

ls -laht /var/log/syslog*
-rw-r----- 1 root adm 127M Jan  4 11:10 /var/log/syslog
-rw-r----- 1 root adm 248M Jan  4 00:00 /var/log/syslog.1
-rw-r----- 1 root adm  30M Jan  3 00:00 /var/log/syslog.2.gz
-rw-r----- 1 root adm  19M Jan  2 00:00 /var/log/syslog.3.gz
-rw-r----- 1 root adm  21M Jan  1 00:00 /var/log/syslog.4.gz
-rw-r----- 1 root adm  23M Dec 31 00:00 /var/log/syslog.5.gz
-rw-r----- 1 root adm  21M Dec 30 00:00 /var/log/syslog.6.gz
-rw-r----- 1 root adm  23M Dec 29 00:00 /var/log/syslog.7.gz
-rw-r----- 1 root adm  21M Dec 28 00:00 /var/log/syslog.8.gz
-rw-r----- 1 root adm  18M Dec 27 00:00 /var/log/syslog.9.gz
-rw-r----- 1 root adm  17M Dec 26 00:00 /var/log/syslog.10.gz
-rw-r----- 1 root adm  26M Dec 25 00:00 /var/log/syslog.11.gz
-rw-r----- 1 root adm  21M Dec 24 00:00 /var/log/syslog.12.gz
-rw-r----- 1 root adm  26M Dec 23 00:00 /var/log/syslog.13.gz
-rw-r----- 1 root adm  29M Dec 22 00:00 /var/log/syslog.14.gz
-rw-r----- 1 root adm  27M Dec 21 00:00 /var/log/syslog.15.gz
-rw-r----- 1 root adm  31M Dec 20 00:00 /var/log/syslog.16.gz
-rw-r----- 1 root adm  31M Dec 19 00:00 /var/log/syslog.17.gz
-rw-r----- 1 root adm  23M Dec 18 00:00 /var/log/syslog.18.gz
-rw-r----- 1 root adm  25M Dec 17 00:00 /var/log/syslog.19.gz
-rw-r----- 1 root adm  28M Dec 16 00:00 /var/log/syslog.20.gz
-rw-r----- 1 root adm  29M Dec 15 00:00 /var/log/syslog.21.gz
-rw-r----- 1 root adm  32M Dec 14 00:00 /var/log/syslog.22.gz
-rw-r----- 1 root adm  33M Dec 13 00:00 /var/log/syslog.23.gz
-rw-r----- 1 root adm  33M Dec 12 00:00 /var/log/syslog.24.gz
-rw-r----- 1 root adm  25M Dec 11 00:00 /var/log/syslog.25.gz
-rw-r----- 1 root adm  24M Dec 10 00:00 /var/log/syslog.26.gz
-rw-r----- 1 root adm  32M Dec  9 00:00 /var/log/syslog.27.gz
-rw-r----- 1 root adm  31M Dec  8 00:00 /var/log/syslog.28.gz
-rw-r----- 1 root adm  29M Dec  7 00:00 /var/log/syslog.29.gz
-rw-r----- 1 root adm  29M Dec  6 00:00 /var/log/syslog.30.gz

Steps to reproduce:

Nothing needs to be done to reproduce this; it happens on its own almost every day.

Which branch are you using?

master

Operating System:

Debian Bullseye 11.8

Server/VM specifications:

64GB RAM, 12 CPU cores, 1.5TB disk

Is Apparmor, SELinux or similar active?

No

Virtualization technology:

Xen

Docker version:

24.0.7

docker-compose version or docker compose version:

v2.21.0

mailcow version:

2023-12a

Reverse proxy:

None

Logs of git diff:

Omitted as it triggered the "There was an error creating your issue: body is too long, body is too long (maximum is 65536 characters)." warning from GitHub that prevented this form being submitted.

Logs of iptables -L -vn:

Omitted as it triggered the "There was an error creating your issue: body is too long, body is too long (maximum is 65536 characters)." warning from GitHub that prevented this form being submitted.

Logs of ip6tables -L -vn:

Omitted as it triggered the "There was an error creating your issue: body is too long, body is too long (maximum is 65536 characters)." warning from GitHub that prevented this form being submitted.

Logs of iptables -L -vn -t nat:

Omitted as it triggered the "There was an error creating your issue: body is too long, body is too long (maximum is 65536 characters)." warning from GitHub that prevented this form being submitted.

Logs of ip6tables -L -vn -t nat:

Omitted as it triggered the "There was an error creating your issue: body is too long, body is too long (maximum is 65536 characters)." warning from GitHub that prevented this form being submitted.

DNS check:

104.18.32.7
172.64.155.249

awsumco commented 8 months ago

My 2 cents here: a suggestion would be to add an environment variable such as ACME_INTERVAL, which would be used in acme.sh in place of the sleep 1d line at the end of the function,

i.e. sleep $ACME_INTERVAL

and in the docker-compose.yml file, under the acme-mailcow service, something like:

${ACME_INTERVAL:-1d}
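
A configurable interval would presumably change how often the check runs, while the time of day would still depend on when the container was last started. One possible variation, sketched here under assumptions, is to sleep until a fixed time of day instead. The names seconds_until and ACME_RENEW_AT are hypothetical and not existing mailcow settings; the sketch needs bash and ignores daylight saving changes:

```bash
#!/usr/bin/env bash

# Hypothetical helper: seconds from now until the next occurrence of HH:MM.
# The optional second argument (HH:MM:SS) overrides the current time and
# exists only so that the function can be tested.
seconds_until() {
  local target="$1" now="${2:-$(date +%H:%M:%S)}"
  local th tm nh nm ns
  IFS=: read -r th tm <<< "${target}"
  IFS=: read -r nh nm ns <<< "${now}"
  # 10# forces base 10 so that "08" and "09" are not read as octal.
  local target_s=$(( 10#${th} * 3600 + 10#${tm} * 60 ))
  local now_s=$(( 10#${nh} * 3600 + 10#${nm} * 60 + 10#${ns} ))
  local delta=$(( target_s - now_s ))
  # If the target time has already passed today, wait until tomorrow.
  if (( delta <= 0 )); then
    delta=$(( delta + 86400 ))
  fi
  echo "${delta}"
}

# Possible use in place of a fixed "sleep 1d" (ACME_RENEW_AT is an assumption):
#   sleep "$(seconds_until "${ACME_RENEW_AT:-03:00}")"
```

With a target of 03:00, a call made at 02:59:00 yields 60 seconds, and a call made at exactly 03:00:00 yields a full day (86400 seconds).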

milkmaker commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

chriscroome commented 5 months ago

Please re-open this issue as it has not been addressed or solved.