Closed kklepper closed 4 years ago
root@IONOS_1: /root/docker-mailserver # docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5dff312a223b jeboehm/mailserver-filter:latest "/usr/local/bin/entr…" About an hour ago Exited (1) About an hour ago docker-mailserver_filter_1
7e7316042933 jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" About an hour ago Up About an hour (unhealthy) 0.0.0.0:110->110/tcp, 0.0.0.0:143->143/tcp, 0.0.0.0:587->587/tcp, 0.0.0.0:993->993/tcp, 2003/tcp, 0.0.0.0:995->995/tcp, 4190/tcp docker-mailserver_mda_1
755f2feb8785 jeboehm/mailserver-db:latest "docker-entrypoint.s…" About an hour ago Up About an hour 3306/tcp, 33060/tcp docker-mailserver_db_1
3b13d1ad44d7 jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" About an hour ago Exited (1) About an hour ago docker-mailserver_mta_1
938dd60775c3 jeboehm/mailserver-ssl:latest "/usr/local/bin/crea…" About an hour ago Exited (0) About an hour ago docker-mailserver_ssl_1
8ac50ab6c8dd jeboehm/mailserver-virus:latest "/usr/local/bin/entr…" About an hour ago Up About an hour (healthy) 3310/tcp docker-mailserver_virus_1
4de3bc5b150e jeboehm/mailserver-web:latest "docker-php-entrypoi…" About an hour ago Exited (1) About an hour ago docker-mailserver_web_1
Hmm, don't you think the API version issue might be the cause of this kind of problem?
If not: check the logs of the stopped containers.
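A minimal sketch of that log check, assuming the containers were started on the local Docker host. The function name show_exited_logs and the default of 20 log lines are illustrative choices, not part of docker-mailserver.

```shell
#!/bin/sh
# Print the last lines of the log of every container in state "exited".
# $1 = number of log lines per container (optional, defaults to 20).
show_exited_logs() {
    docker ps -a --filter "status=exited" --format '{{.ID}} {{.Names}}' |
    while read -r id name; do
        echo "=== $name ($id) ==="
        # docker logs writes to both stdout and stderr, so merge them.
        docker logs --tail "${1:-20}" "$id" 2>&1 </dev/null
    done
}
```

Calling show_exited_logs 50 after a failed start would print up to 50 lines for each stopped container. The ssl container also shows up there, although its Exited (0) status is a normal exit.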
Edit: I just moved from travis-ci to github actions. All integration tests are still running fine.
Thank you. Ok, never did travis, something to learn.
How could the API version interfere here? I noticed you updated the images a week ago. So I work with new images now. How can I test with the old images?
Will check the logs ASAP.
Travis is nothing you need to learn for now, but it runs docker-mailserver on a clean system on every code change and tests it in various ways. For me this is an additional sign that you have something broken in your system.
downgrading
curl -L "https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
root@IONOS_2: /root # docker info
Client:
Debug Mode: false
Server:
Containers: 24
Running: 20
Paused: 0
Stopped: 4
Images: 49
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: re9hm1njacdpdz8wxdlztyred
Is Manager: true
ClusterID: nezb1aljkqly70j2v7ln705ku
Managers: 1
Nodes: 1
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 217.160.241.84
Manager Addresses:
217.160.241.84:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.18.0-147.8.1.el8_1.x86_64
Operating System: CentOS Linux 8 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.607GiB
Name: mail.xxx.tld
ID: U6R6:LG75:3V7W:TEZD:DHBC:WLRJ:TYHO:YIWZ:RTKE:CWQB:SYLJ:WCKC
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: kklepper
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
root@IONOS_2: /root # docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:27:04 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:25:42 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.6
GitCommit: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc:
Version: 1.0.0-rc8
GitCommit: 425e105d5a03fabd737a126ad93d62a9eeede87f
docker-init:
Version: 0.18.0
GitCommit: fec3683
root@IONOS_2: /root # docker-compose version
docker-compose version 1.23.1, build b02f1306
docker-py version: 3.5.0
CPython version: 3.6.7
OpenSSL version: OpenSSL 1.1.0f 25 May 2017
Run bin/production.sh up -d to start the services.
root@IONOS_2: /root/docker-mailserver # docker-compose up -d
WARNING: The Docker Engine you're using is running in swarm mode.
Compose does not use swarm mode to deploy services to multiple nodes in a swarm. All containers will be scheduled on the current node.
To deploy your application across the swarm, use `docker stack deploy`.
Creating network "docker-mailserver_default" with the default driver
Creating docker-mailserver_mda_1_1614147f041b ... done
Creating docker-mailserver_db_1_bcc9cdc42428 ... done
Creating docker-mailserver_mta_1_66c8365f59a4 ... done
Creating docker-mailserver_ssl_1_cc1f9d511145 ... done
Creating docker-mailserver_web_1_8c74292ebec2 ... done
Creating docker-mailserver_virus_1_3215258e5588 ... done
Creating docker-mailserver_filter_1_5e3147d5fc67 ... done
After a few seconds you can access the services listed in the paragraph Services.
root@IONOS_2: /root/docker-mailserver # docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f84fb2ba9f6c jeboehm/mailserver-filter:latest "/usr/local/bin/entr…" About a minute ago Up About a minute (health: starting) 11332/tcp, 11334/tcp docker-mailserver_filter_1_d0b90b73c09b
49f814bd7040 jeboehm/mailserver-web:latest "docker-php-entrypoi…" About a minute ago Up About a minute (health: starting) 80/tcp, 9000/tcp docker-mailserver_web_1_190dbdfd2994
791859b5d89f jeboehm/mailserver-virus:latest "/usr/local/bin/entr…" About a minute ago Up About a minute (health: starting) 3310/tcp docker-mailserver_virus_1_5b9bb86c6bb5
8dca8ef071ba jeboehm/mailserver-ssl:latest "/usr/local/bin/crea…" About a minute ago Exited (0) About a minute ago docker-mailserver_ssl_1_b02abc5b401d
edcc5559eac6 jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" About a minute ago Up About a minute (health: starting) 25/tcp docker-mailserver_mta_1_46ca460e02e5
8db04846c1b0 jeboehm/mailserver-db:latest "docker-entrypoint.s…" About a minute ago Up About a minute 3306/tcp, 33060/tcp docker-mailserver_db_1_9e4142711371
ea61fd97b11c jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" About a minute ago Up About a minute (health: starting) 110/tcp, 143/tcp, 993/tcp, 995/tcp, 2003/tcp, 4190/tcp docker-mailserver_mda_1_cdfae885f1a0
root@IONOS_2: /root/docker-mailserver # curl -v localhost:81
* Rebuilt URL to: localhost:81/
* Trying 127.0.0.1...
* TCP_NODELAY set
* connect to 127.0.0.1 port 81 failed: Connection refused
* Trying ::1...
* TCP_NODELAY set
* connect to ::1 port 81 failed: Connection refused
* Failed to connect to localhost port 81: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 81: Connection refused
root@IONOS_2: /root/docker-mailserver # docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f84fb2ba9f6c jeboehm/mailserver-filter:latest "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes (healthy) 11332/tcp, 11334/tcp docker-mailserver_filter_1_d0b90b73c09b
49f814bd7040 jeboehm/mailserver-web:latest "docker-php-entrypoi…" 2 minutes ago Exited (1) 29 seconds ago docker-mailserver_web_1_190dbdfd2994
791859b5d89f jeboehm/mailserver-virus:latest "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes (healthy) 3310/tcp docker-mailserver_virus_1_5b9bb86c6bb5
8dca8ef071ba jeboehm/mailserver-ssl:latest "/usr/local/bin/crea…" 2 minutes ago Exited (0) 2 minutes ago docker-mailserver_ssl_1_b02abc5b401d
edcc5559eac6 jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes (healthy) 25/tcp docker-mailserver_mta_1_46ca460e02e5
8db04846c1b0 jeboehm/mailserver-db:latest "docker-entrypoint.s…" 2 minutes ago Up 2 minutes 3306/tcp, 33060/tcp docker-mailserver_db_1_9e4142711371
ea61fd97b11c jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes (healthy) 110/tcp, 143/tcp, 993/tcp, 995/tcp, 2003/tcp, 4190/tcp docker-mailserver_mda_1_cdfae885f1a0
49f814bd7040 ... jeboehm/mailserver-web:latest ... 2 minutes ago ... Exited (1) 29 seconds ago
root@IONOS_2: /root/docker-mailserver # docker logs 49f814bd7040
2020/05/29 14:55:53 Waiting for: tcp://db:3306
2020/05/29 14:55:53 Waiting for: tcp://mda:143
2020/05/29 14:55:53 Waiting for: tcp://mta:25
2020/05/29 14:55:53 Waiting for: tcp://filter:11334
2020/05/29 14:55:53 Waiting for: file:///media/dkim/
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:143: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:25: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:54 File file:///media/dkim/ had been generated
2020/05/29 14:55:54 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:54 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
[...=============================================================================================================...]
2020/05/29 14:57:53 Problem with dial: dial tcp 217.160.241.84:143: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:57:53 Problem with dial: dial tcp 217.160.241.84:25: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:57:53 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:57:53 Timeout after 2m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:143 tcp://mta:25 tcp://filter:11334 file:///media/dkim/]
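A log like this is easier to read when the repeated dial errors are condensed into one line per target. The following is a small sketch that assumes the log format shown above; the function name summarize_dial_failures is made up for illustration.

```shell
#!/bin/sh
# Read a container log on stdin and count the "Problem with dial" lines
# per target, printing "<count> <ip> <port>" for each distinct target.
summarize_dial_failures() {
    sed -n 's/.*dial tcp \([0-9.]*\):\([0-9]*\):.*/\1 \2/p' |
    sort |
    uniq -c
}
```

It could be fed with docker logs 49f814bd7040 2>&1 | summarize_dial_failures. If every target shows the same address, all service names resolved to one IP, which is worth checking before anything else.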
Hello,
maybe I found something.
To begin with, I made sure that no container is running. Then I started mailserver with
root@IONOS_1: /root/docker-mailserver # bin/production.sh up
When all containers were running, error messages were produced abundantly. There are 3 containers producing these error messages:
filter_1 | 2020/06/02 06:57:38 Problem with dial: dial tcp 192.168.208.2:3310: getsockopt: connection refused. Sleeping 1s
web_1 | 2020/06/02 06:57:38 Problem with dial: dial tcp 192.168.208.5:25: getsockopt: connection refused. Sleeping 1s
web_1 | 2020/06/02 06:57:38 Problem with dial: dial tcp 192.168.208.4:11334: getsockopt: connection refused. Sleeping 1s
mta_1 | 2020/06/02 06:57:38 Problem with dial: dial tcp 192.168.208.4:11332: getsockopt: connection refused. Sleeping 1s
They keep repeating over and over until the corresponding container gives up.
Now what does this mean? I interpret these error messages as records of attempts to establish a connection between this container and another one. For example, filter_1 tries to connect to the container running on IP 192.168.208.2, which should have port 3310 open.
If I remember correctly, this was the way to talk to containers in the stone age of docker. For several years now, containers have talked to each other by name. Why do we have this problem in the first place? Who is trying to contact another container the old way?
In order to understand this error message better, I wrote a little shell script. One question here is, for example, which container runs on IP 192.168.208.2? You can find out by running cat /etc/hosts in all containers and looking for this IP. It is db.
The next question is, can filter_1 see db and vice versa? To answer this question, traditionally ping is used.
db does not know about ping, so in order to make my shell script run, ping has to be installed.
For the record:
root # docker ps -a | grep "db"
8e5204399c13 jeboehm/mailserver-db:latest "docker-entrypoint.s…" 23 hours ago Up 23 hours 3306/tcp, 33060/tcp docker-mailserver_200527_db_1
root@IONOS_1: /root # id=8e5204399c13
root@IONOS_1: /root # docker exec -it $id bash
root@8e5204399c13:/# apt-get update
Get:1 http://repo.mysql.com/apt/debian buster InRelease [21.5 kB]
Get:2 http://deb.debian.org/debian buster InRelease [121 kB]
Get:3 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:4 http://deb.debian.org/debian buster-updates InRelease [49.3 kB]
Get:5 http://repo.mysql.com/apt/debian buster/mysql-5.7 amd64 Packages [5685 B]
Get:6 http://deb.debian.org/debian buster/main amd64 Packages [7905 kB]
Get:7 http://deb.debian.org/debian buster-updates/main amd64 Packages [7380 B]
Get:8 http://security.debian.org/debian-security buster/updates/main amd64 Packages [201 kB]
Fetched 8377 kB in 2s (3574 kB/s)
Reading package lists... Done
root@8e5204399c13:/# apt-get install iputils-ping -y
[...]
filter_1 is no good example either, as ping: permission denied (are you root?) gave me a problem I did not know how to solve. So I turned to web_1 first to make sure I'm on the right track and then wrote this script, check_containers.sh, to automate this investigation:
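#!/bin/sh
# Check whether the docker-mailserver containers can reach each other by name.
services="db mta mda web filter virus"

# check_service NAME: print the container's 192.x address from /etc/hosts,
# then ping every other service from inside that container.
check_service(){
    # Look up the container ID by service name (assumes exactly one match).
    id=$(docker ps -a | grep "$1" | awk '{print $1}')
    # Most images ship ash; the db image has bash instead.
    cmd=ash
    case $1 in
        db) cmd=bash;;
    esac
    ip=$(docker exec "$id" "$cmd" -c 'grep "192" /etc/hosts')
    echo "$1 $ip"
    for target in $services
    do
        check_ping "$1" "$target" "$cmd" "$id"
    done
}

# check_ping SOURCE TARGET SHELL ID: ping TARGET once from inside container ID.
check_ping(){
    # Do not ping a container from itself ("=" is the portable sh comparison).
    if [ "$1" = "$2" ]
    then
        return
    fi
    echo "$1 => $2"
    docker exec "$4" "$3" -c "ping -c 1 $2 | grep from"
}

# Check just this single service: web
#check_service web
#exit

# Check all services.
for s in $services
do
    echo "$s"
    check_service "$s"
done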
If I call this script soon enough, all containers are still running. The result is:
root@IONOS_1: /root/docker-mailserver # ./check_containers.sh
db
db 192.168.208.2 0b5282addc10
db => mta
bash: ping: command not found
db => mda
bash: ping: command not found
db => web
bash: ping: command not found
db => filter
bash: ping: command not found
db => virus
bash: ping: command not found
mta
mta 192.168.208.5 bb604c738034
mta => db
64 bytes from 192.168.208.2: seq=0 ttl=64 time=0.114 ms
mta => mda
64 bytes from 192.168.208.6: seq=0 ttl=64 time=0.095 ms
mta => web
64 bytes from 192.168.208.7: seq=0 ttl=64 time=0.103 ms
mta => filter
64 bytes from 192.168.208.3: seq=0 ttl=64 time=0.099 ms
mta => virus
64 bytes from 192.168.208.4: seq=0 ttl=64 time=0.150 ms
mda
mda 192.168.208.6 c0322a4398e4
mda => db
64 bytes from 192.168.208.2: seq=0 ttl=64 time=0.128 ms
mda => mta
64 bytes from 192.168.208.5: seq=0 ttl=64 time=0.104 ms
mda => web
64 bytes from 192.168.208.7: seq=0 ttl=64 time=0.096 ms
mda => filter
64 bytes from 192.168.208.3: seq=0 ttl=64 time=0.152 ms
mda => virus
64 bytes from 192.168.208.4: seq=0 ttl=64 time=0.141 ms
web
web 192.168.208.7 640c2397cfc1
web => db
64 bytes from 192.168.208.2: seq=0 ttl=64 time=0.113 ms
web => mta
64 bytes from 192.168.208.5: seq=0 ttl=64 time=0.150 ms
web => mda
64 bytes from 192.168.208.6: seq=0 ttl=64 time=0.098 ms
web => filter
64 bytes from 192.168.208.3: seq=0 ttl=64 time=0.116 ms
web => virus
64 bytes from 192.168.208.4: seq=0 ttl=64 time=0.175 ms
filter
filter 192.168.208.3 dd0ce897e631
filter => db
ping: permission denied (are you root?)
filter => mta
ping: permission denied (are you root?)
filter => mda
ping: permission denied (are you root?)
filter => web
ping: permission denied (are you root?)
filter => virus
ping: permission denied (are you root?)
virus
virus 192.168.208.4 b822ab9e0a92
virus => db
ping: permission denied (are you root?)
virus => mta
ping: permission denied (are you root?)
virus => mda
ping: permission denied (are you root?)
virus => web
ping: permission denied (are you root?)
virus => filter
ping: permission denied (are you root?)
root@IONOS_1: /root/docker-mailserver #
After installing ping in db, we get positive results here as well:
db
db 192.168.208.2 0b5282addc10
db => mta
64 bytes from docker-mailserver_mta_1.docker-mailserver_default (192.168.208.5): icmp_seq=1 ttl=64 time=0.108 ms
db => mda
64 bytes from docker-mailserver_mda_1.docker-mailserver_default (192.168.208.6): icmp_seq=1 ttl=64 time=0.102 ms
db => web
64 bytes from docker-mailserver_web_1.docker-mailserver_default (192.168.208.7): icmp_seq=1 ttl=64 time=0.097 ms
db => filter
64 bytes from docker-mailserver_filter_1.docker-mailserver_default (192.168.208.3): icmp_seq=1 ttl=64 time=0.136 ms
db => virus
64 bytes from docker-mailserver_virus_1.docker-mailserver_default (192.168.208.4): icmp_seq=1 ttl=64 time=0.150 ms
This shows that all containers can see each other, assuming the ping: permission denied (are you root?) problem can be solved. The question remains why the connection is set up in a way that must fail.
Please tell me if I am right in my analysis and what I can do to fix these things.
Each service checks the availability of other services before starting, that's right. This is done via the application Dockerize. Dockerize is called by the /usr/local/bin/entrypoint.sh of each container. As you can see there, name resolution is done unless you change the corresponding variables. This is also shown by your log output 4 days ago: https://github.com/jeboehm/docker-mailserver/issues/85#issuecomment-636025437
2020/05/29 14:55:53 Waiting for: tcp://db:3306
2020/05/29 14:55:53 Waiting for: tcp://mda:143
2020/05/29 14:55:53 Waiting for: tcp://mta:25
2020/05/29 14:55:53 Waiting for: tcp://filter:11334
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:143: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
2020/05/29 14:55:53 Problem with dial: dial tcp 217.160.241.84:25: getsockopt: connection refused. Sleeping 1s
Can you explain why the names of db, mda, mta and filter resolve to the same public ip address in the above example? -> 217.160.241.84 And why the non-default IP range of your second example (192.168.208.5) is completely different from above?
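The distinction behind this question can be checked mechanically: addresses handed out by Docker networks normally fall into the private IPv4 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), whereas 217.160.241.84 does not. A minimal sketch, with is_private_ipv4 as a hypothetical helper name:

```shell
#!/bin/sh
# Return 0 if $1 is in one of the private IPv4 ranges
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), 1 otherwise.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Used as is_private_ipv4 217.160.241.84 || echo "resolved to a public address", it flags the first log, while the 192.168.208.x addresses of the second example pass.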
I think that this is not the right place to solve your Kubernetes/Docker Swarm/... issues and I would kindly ask you to try this in a different place.
.. but to say something useful and just to be complete:
https://github.com/jeboehm/docker-mailserver/wiki/Troubleshooting#connection-errors-are-popping-up
Sorry for acting angrily, but it makes me crazy to see your whole environment changing every time you post here ;P
I've improved the startup time of the virus service. This should make your tests easier, so make sure that you have the latest image version.
First I'd like to state that I appreciate your work very much, the more so as in the meantime I studied quite a number of other systems and didn't succeed with them either.
Furthermore I learned that any complete mail system is quite complicated and definitely nothing to be done easily.
Also I appreciate your time and effort trying to help me very much. Nevertheless, if you don't care anymore, it's okay with me.
For the record and for others tapping into this problem, I will continue, though, until I find a solution or give up. Talking to myself does help, too.
I'd like to see the first success I had when I installed your system the first time without problems, but I can't. I don't understand why this is so. A docker system should work out-of-the-box every time.
It looks like it works on IONOS_2 now. Don't know why. I'll have to inspect some more, though, then tell you the details and answer your questions.
On IONOS_2:
IONOS_2)
IONOS_1)
Why? Obviously I have to change the A record of mail.zzz.tld, which points to IONOS_1, to the IP of IONOS_2, hosting the mail server.
I was surprised I could add a domain -- so I will see in a while if I can skip setting up a separate mailserver on IONOS_1, which would be really great.
Still the question remains why IONOS_1 will not run as expected (and why IONOS_2 had these problems in the first place and works fine now).
I'm sorry that I confused you with my reports. Let me explain.
I started out with a server named IONOS_2 with 8GB RAM and 160GB SSD.
As I ran into problems when struggling with POP3 and SMTP and as the overall setup of IONOS_2 is more complicated, I tried to simplify things by switching to a 2nd server named IONOS_1 with 0.5GB RAM and 10GB SSD, which has a much simpler setup.
On this machine, I experienced the same problems, however.
Can you explain why the names of db, mda, mta and filter resolve to the same public ip address in the above example? -> 217.160.241.84
Good catch.
This is the IP of my 2nd virtual server, IONOS_2. In an attack of paranoia, I had closed ports 25 110 143 587 993 995 except for the IP of my office machine. So this connection attempt could not succeed. This was an obvious misconception and easily corrected.
At the time when I posted this I wasn't aware of this problem, however. As I did the same with IONOS_1, I spotted the problem later on that machine myself and could correct my mistake.
And why the non-default IP range of your second example (192.168.208.5) is completely different from above?
This is probably due to different experimental series. Docker picks random ranges (192.-, 172.-).
https://github.com/jeboehm/docker-mailserver/wiki/Troubleshooting#connection-errors-are-popping-up
Thank you for the advice. I had a look at this address. If it only took 2 minutes, it wouldn't be a problem, but it keeps on trying until the system gives up. So I thought it might be a good idea to present you the whole output of the startup procedure. Please tell me if that is okay with you.
I'd like to debug this, but I don't know where to begin. As I proved to you, all the containers can ping each other, but still connection problems are reported.
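A successful ping only shows that the other container answers ICMP. "connection refused" is a TCP-level answer: the host was reached, but nothing was listening on that port at that moment. A sketch of a port check run from inside a container follows. It assumes the image ships an nc that supports -z and -w, which is not guaranteed for every build, and check_tcp_from is a hypothetical name.

```shell
#!/bin/sh
# Test a TCP port from inside a container.
# $1 = container ID or name, $2 = target host, $3 = target port
check_tcp_from() {
    # nc -z only tests whether the port accepts a connection; -w 2 limits the wait.
    if docker exec "$1" nc -z -w 2 "$2" "$3" >/dev/null 2>&1; then
        echo "$1 -> $2:$3 open"
    else
        echo "$1 -> $2:$3 refused or unreachable"
    fi
}
```

For example, check_tcp_from docker-mailserver_web_1 mta 25 repeats the same test that the web container logs as "dial tcp ...:25".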
I found something interesting at jwilder:
timeout and wait-retry-interval not working
this was user error, I was adding the wait and timeout in an entrypoint where there was a second usage of dockerize in a dockerfile I wasnt aware of
Unfortunately, I don't know what to do with this.
In the other thread, I told you that I was struggling with the proxy to handle TLS and the client IP to be passed through to the docker zoo. In the meantime I have understood the problem and found a solution, so both machines work flawlessly except they need a mail system.
IONOS_2 uses swarm mode, IONOS_1 does not. There is really no difference except that in swarm mode, you can set up replicas and more machines and spread the whole system to these. As I don't have more machines at the moment, this doesn't really mean anything.
Actually the nginx proxy runs alone as a single container on both machines (and not in swarm mode) and communicates with the rest of the band via an external network. In this scenario, docker-mailserver would constitute another group of containers to be connected to from the proxy.
I'll have to study your instructions for a proxy server.
I think that once I get docker-mailserver to run on one system, I will be able to do it on the other as well.
For the record:
On IONOS_1, the up procedure lasted more than 13 minutes until the system gave up:
Begin:
filter_1 | 2020/06/03 14:03:46 Waiting for: tcp://virus.local:3310
filter_1 | 2020/06/03 14:03:46 Problem with dial: dial tcp 192.168.224.3:3310: getsockopt: connection refused. Sleeping 1s
Last connection refused error
filter_1 | 2020/06/03 14:16:55 Problem with dial: dial tcp 192.168.224.3:3310: getsockopt: connection refused. Sleeping 1s
root@IONOS_1: /root # docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
24117bd576e3 jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" 9 hours ago Up 9 hours (unhealthy) 0.0.0.0:110->110/tcp, 0.0.0.0:143->143/tcp, 0.0.0.0:587->587/tcp, 0.0.0.0:993->993/tcp, 2003/tcp, 0.0.0.0:995->995/tcp, 4190/tcp docker-mailserver_mda_1
e329970ac6a0 jeboehm/mailserver-web:latest "docker-php-entrypoi…" 9 hours ago Exited (1) 8 hours ago docker-mailserver_web_1
ab81a3bce6d6 jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" 9 hours ago Exited (1) 8 hours ago docker-mailserver_mta_1
1036ccca61d6 jeboehm/mailserver-filter:latest "/usr/local/bin/entr…" 9 hours ago Exited (1) 8 hours ago docker-mailserver_filter_1
68a496eae1a5 jeboehm/mailserver-db:latest "docker-entrypoint.s…" 9 hours ago Up 9 hours 3306/tcp, 33060/tcp docker-mailserver_db_1
bcb6c91823d9 jeboehm/mailserver-ssl:latest "/usr/local/bin/crea…" 9 hours ago Exited (0) 9 hours ago docker-mailserver_ssl_1
ee7d9a22d22b jeboehm/mailserver-virus:latest "/usr/local/bin/entr…" 9 hours ago Up 9 hours (healthy) 3310/tcp docker-mailserver_virus_1
cb4db08c45e1 jeboehm/mailserver-web:latest "docker-php-entrypoi…" 9 hours ago Exited (1) 9 hours ago docker-mailserver_200603_web_1
9131415946e6 jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" 9 hours ago Exited (1) 9 hours ago docker-mailserver_200603_mta_1
77ca0e1288f7 jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" 9 hours ago Exited (1) 9 hours ago docker-mailserver_200603_mda_1
Different picture at IONOS_2, only 39s until last connection error:
Begin:
Attaching to docker-mailserver_db_1, docker-mailserver_mta_1, docker-mailserver_virus_1, docker-mailserver_ssl_1, docker-mailserver_mda_1, docker-mailserver_web_1, docker-mailserver_filter_1
db_1 | 2020-06-03 15:46:54+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.30-1debian10 started.
db_1 | 2020-06-03 15:46:54+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'
db_1 | 2020-06-03 15:46:54+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.30-1debian10 started.
db_1 | 2020-06-03T15:46:54.800385Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
db_1 | 2020-06-03T15:46:54.802672Z 0 [Note] mysqld (mysqld 5.7.30) starting as process 1 ...
Last connection refused error
web_1 | 2020/06/03 15:47:25 Problem with dial: dial tcp 172.31.0.2:25: getsockopt: connection refused. Sleeping 1s
root@IONOS_2: /root/docker-mailserver # docker ps -a | grep "jeb"
9ea58f48eb2c jeboehm/mailserver-filter:latest "/usr/local/bin/entr…" 46 minutes ago Up 46 minutes (healthy) 11332/tcp, 11334/tcp docker-mailserver_filter_1
c615f5303339 jeboehm/mailserver-mda:latest "/usr/local/bin/entr…" 46 minutes ago Up 46 minutes (healthy) 0.0.0.0:110->110/tcp, 0.0.0.0:143->143/tcp, 0.0.0.0:587->587/tcp, 0.0.0.0:993->993/tcp, 2003/tcp, 0.0.0.0:995->995/tcp, 4190/tcp docker-mailserver_mda_1
278a131bef6a jeboehm/mailserver-ssl:latest "/usr/local/bin/crea…" 46 minutes ago Exited (0) 46 minutes ago docker-mailserver_ssl_1
67e85b39d221 jeboehm/mailserver-db:latest "docker-entrypoint.s…" 46 minutes ago Up 46 minutes 3306/tcp, 33060/tcp docker-mailserver_db_1
cbd598402325 jeboehm/mailserver-web:latest "docker-php-entrypoi…" 46 minutes ago Up 46 minutes (healthy) 9000/tcp, 0.0.0.0:81->80/tcp docker-mailserver_web_1
f5a580009642 jeboehm/mailserver-virus:latest "/usr/local/bin/entr…" 46 minutes ago Up 46 minutes (healthy) 3310/tcp docker-mailserver_virus_1
1e636533ce3e jeboehm/mailserver-mta:latest "/usr/local/bin/entr…" 46 minutes ago Up 46 minutes (healthy) 0.0.0.0:25->25/tcp docker-mailserver_mta_1
There is an obvious difference in the startup sequence: IONOS_2 starts with db, IONOS_1 starts with errors.
Both .env files are identical except for MAILNAME.
Both are freshly cloned. There is enough space on both.
I begin to understand what's happening on IONOS_2.
On both machines, IONOS_1 and IONOS_2, I'd like to call the web presences with TLS -- there should be no discussion about that nowadays.
Also, I'm told, browsers should be instructed to never use a connection without TLS, which prevents using docker-mailserver on port 81 without TLS.
My first workaround was to introduce a separate FQDN running without TLS which would be served by docker-mailserver, and it works, but that doesn't feel right.
Next I tried jwilder/nginx-proxy (161MB) according to your instructions, but it didn't work (502 Bad Gateway). I didn't have the guts to find the bug so I looked for a different solution.
I already have an NGINX proxy, 2proxy (21MB), which serves ports 80 and 443 for the main FQDN and talks to my docker-compose band xx via an external network, proxy. So the most natural thing would be to add the external network proxy to the docker-mailserver docker-compose.yml as well.
This works, but in order for the proxy to be able to see the container web, this container must be exposed to that network.
And here we have it (from docker logs <ID of jeboehm/mailserver-web:latest>):
web_1 | 2020/06/05 22:06:02 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
Mind you, this is the IP of the machine the whole stuff runs on, IONOS_2. Port 3306 doesn't make sense here.
Of course, web crashes and finally gives up.
How come? How could a simple
networks:
- proxy
have this consequence?
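One possible explanation, which nothing in this thread confirms: in Compose, a service that declares its own networks list is attached only to the networks named there. Unless default is listed as well, web would leave the project network on which db, mda, mta and filter are reachable by name. Which networks a container actually joined can be inspected as sketched below; container_networks and has_network are hypothetical helper names.

```shell
#!/bin/sh
# Print the names of all networks a container is attached to.
# $1 = container ID or name
container_networks() {
    docker inspect \
        --format '{{range $name, $cfg := .NetworkSettings.Networks}}{{$name}} {{end}}' \
        "$1"
}

# Return 0 if container $1 is attached to network $2.
has_network() {
    case " $(container_networks "$1") " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the names used in this thread, has_network docker-mailserver_web_1 docker-mailserver_default || echo "web is not on the project network" would show whether web still shares a network with db.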
I introduced a second external network, mail, for the jeboehm suite, with 2proxy serving both networks, proxy and mail, so that docker-mailserver has no contact with the other band xx, to no avail.
Then I removed mail from 2proxy, so docker-mailserver is exposed to a network no other container uses: web again exited after 2 minutes.
2020/06/06 00:31:33 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
2020/06/06 00:31:33 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/06/06 00:31:33 Timeout after 2m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:143 tcp://mta:25 tcp://filter:11334 file:///media/dkim/]
What makes web behave this way?
Actually, I kept logs to be able to inspect them later and found 2 entries with the host IP from the workaround with the additional FQDN:
root@IONOS_2: /root/docker-mailserver # cat ms.2020-06-05___16:42:00.log | grep "\.84"
mta_1 | 2020/06/05 14:42:02 Problem with dial: dial tcp 217.160.241.84:11332: getsockopt: connection refused. Sleeping 1s
web_1 | 2020/06/05 14:42:02 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
This did not lead to problems, though. So even in regular use, this call to port 1133x on the host IP seems to be ok.
For POP and SMTP, I suspect that the configuration of mta is not ok which, if true, would be the reason no configuration of Thunderbird succeeds. I will inspect this next.
IONOS_1 is a different thing, docker-mailserver will not run at all. Still investigating why.
So, in retrospect, the problems started when I introduced a network on IONOS_2. And when I switched to IONOS_1 I ran into the problems I cannot solve yet.
IONOS_1:
I was in the process of describing the error conditions on IONOS_1 when, all of a sudden, doing a simple bin/production.sh up on a running and partly broken system, everything was fine.
Imagine my surprise!
I took this system down with bin/production.sh down and restarted it again with bin/production.sh up, getting the well-known errors, then did a refresh with bin/production.sh up and again the system was fine.
I could reproduce this twice, then no more for half a dozen trials, then again ok.
How can I debug this?
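Because the failure is intermittent, it may help to record after each attempt which containers ended up broken. A small sketch, assuming the docker ps status strings look like the ones shown in this thread; failed_containers is an illustrative name.

```shell
#!/bin/sh
# List containers that are unhealthy or exited with a non-zero status.
# "Exited (0)" is skipped because the ssl container exits that way by design.
failed_containers() {
    docker ps -a --format '{{.Names}} {{.Status}}' |
    grep -E 'unhealthy|Exited \([1-9]'
}
```

Running failed_containers a few minutes after each bin/production.sh up and keeping the output next to the saved startup log gives a record of which service gave up in which run.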
Next I opened port 81 on IONOS_1
, installed a user and made sure the system would run on an additional FQDN, same trick as withIONOS_2
, and it worked, too. Then I sent an e-mail, but it didn't arrive. I forwarded this e-mail to another address, but this one didn't arrive either.
Now there was something dubious with my setup. I used the same FQDN on both accounts, IONOS_1 and IONOS_2. This looks awkward. So I installed MX records to my testing FQDNs delivering the mail service in order to have more domains to add.
I deleted the duplicate domain, added the new FQDN, added users and sent e-mails, and it worked. Great.
So far so good. If it works, it works. If it doesn't -- I don't know why. How can I find out?
Looking at the logs, we have the same error messages of type
2020/06/06 16:36:06 Problem with dial: dial tcp 172.27.0.6:25: getsockopt: connection refused. Sleeping 1s
2020/06/06 16:36:06 Problem with dial: dial tcp 172.27.0.3:11334: getsockopt: connection refused. Sleeping 1s
2020/06/06 16:36:07 Timeout after 2m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:143 tcp://mta:25 tcp://filter:11334 file:///media/dkim/]
Checking all those containers with ping reveals that there is no problem, except that filter refuses to execute ping (ping: permission denied (are you root?)) and db has no ping (OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused "exec: \"6e3764131f13\": executable file not found in $PATH": unknown).
Do you have an idea what could be causing this?
To me, this looks like a timing error. If I understood correctly, you added waiting states using dockerize. Where would I adjust these? I guess I should increase the wait interval.
As far as I can see, the retry interval is pretty short; the log is filled with error messages, and it looks like it fires every second. Would it be promising to increase this interval? I don't think this interval is set by dockerize, but rather by the logic of the container. Where should I look? Are there any parameters to be set?
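A note on the retry interval: the "Sleeping 1s" in the log looks like a wait loop with a fixed one-second pause. As far as I know dockerize accepts -timeout and -wait-retry-interval, but whether the entrypoint scripts here pass the latter through is an assumption I have not checked. The retry logic itself can be sketched in plain shell; wait_for and its arguments are hypothetical names, not part of docker-mailserver:

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds or the timeout is hit.
# usage: wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # Run the probe command; its own output is discarded.
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Timeout after ${timeout}s waiting on: $*" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example (assumes nc is installed where this runs):
# wait_for 240 5 nc -z filter 11334
```

A longer interval only makes the log quieter. If the dependency never opens its port, the timeout is still reached.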
By the way, I got my testing FQDNs via freenom.com (I chose TLDs tk, ga, ga, pointed one to IONOS_2 and the other 2 to IONOS_1).
Thinking about it, I could live with this workaround forever. On the other hand, I'd really like to know why adding a network to docker-compose.yml / web produces this strange error as reported 7 hours ago.
Interesting: I tried to get a letsencrypt certificate for the tk domain, but failed. It looks like they don't want to process these TLDs.
I think I managed to develop an acceptable workaround.
On this machine, I can bring up the system, although most of the time it takes a lot of retries. I don't know why this is so, but as I found out that I can serve all my FQDNs from IONOS_2, I'm not inclined to investigate this case any further.
In order to ease this tedious procedure, I finally wrote two shell scripts, 1mu and 1md (denoting mail up and mail down on IONOS_1), and joined them like so:
1md; 1mu; docker ps -a | grep "jeb" | grep -v "ssl:"
to see the result, and then repeatedly call docker ps -a | grep "jeb" | grep -v "ssl:" to see the progress.
root@IONOS_1: /root # cat ~/bin/1mu
#!/bin/sh
datum=$(date +%Y-%m-%d___%H:%M:%S) # better reading
log="ms.$datum.log"
cd ~/docker-mailserver || exit 1
# check disk usage (tee -a appends, so later steps do not overwrite the log)
df | grep "map" | tee -a "$log"
# remove old instances
docker ps -a | grep 'jeb' | awk '{print $1}' | xargs docker rm | tee -a "$log"
# better leave volumes, takes time to create
# docker volume prune | tee -a "$log"
# in case we do prune see what we gained
# df | grep "map" | tee -a "$log"
# start `docker-mailserver`
# bin/production.sh up | tee -a "$log"
bin/production.sh up -d
root@IONOS_1: /root # cat ~/bin/1md
#!/bin/sh
~/docker-mailserver/bin/production.sh down
I put these into /root/bin as this directory is in PATH.
On IONOS_1, the phenomenon is that regularly 3 containers exit after the maximum number of restarts:
mta
web
filter
And if I try often enough, it works. Really enigmatic.
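The "try often enough" routine could be automated. The following is only a sketch: exited_containers is a hypothetical helper, and the limit of 10 attempts and the pause of 150 seconds in the commented loop are arbitrary assumptions, not values from docker-mailserver.

```shell
#!/bin/sh
# Print the names of containers whose status starts with "Exited".
# Input: lines of "NAME STATUS..." as printed by
#   docker ps -a --format '{{.Names}} {{.Status}}'
exited_containers() {
    awk '$2 == "Exited" { print $1 }'
}

# Example loop (commented out; 1md and 1mu are the two scripts shown above):
# n=0
# while [ "$n" -lt 10 ]; do
#     1md; 1mu
#     sleep 150   # give the wait logic time to succeed or time out
#     bad=$(docker ps -a --format '{{.Names}} {{.Status}}' | grep -v ssl | exited_containers)
#     [ -z "$bad" ] && break
#     echo "attempt $n: still exited: $bad"
#     n=$((n + 1))
# done
```

The ssl container is filtered out because it exits with status 0 by design after creating the certificate.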
docker logs <ID of mta>
2020/06/08 14:36:51 Problem with dial: dial tcp 192.168.144.5:11332: getsockopt: connection refused. Sleeping 1s
2020/06/08 14:36:52 Timeout after 1m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:2003 tcp://filter:11332 file:///media/tls/mailserver.crt file:///media/tls/mailserver.key]
docker logs <ID of web>
2020/06/08 14:36:51 Problem with dial: dial tcp 192.168.144.3:25: getsockopt: connection refused. Sleeping 1s
2020/06/08 14:36:51 Problem with dial: dial tcp 192.168.144.5:11334: getsockopt: connection refused. Sleeping 1s
2020/06/08 14:36:52 Timeout after 1m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:143 tcp://mta:25 tcp://filter:11334 file:///media/dkim/]
docker logs <ID of filter>
2020/06/08 14:37:20 Problem with dial: dial tcp 192.168.144.2:3310: getsockopt: connection refused. Sleeping 1s
2020/06/08 14:37:21 Timeout after 1m0s waiting on dependencies to become available: [tcp://virus.local:3310]
I changed WAITSTART_TIMEOUT=2m to WAITSTART_TIMEOUT=4m and to WAITSTART_TIMEOUT=1m; it made no difference.
The situation is different on IONOS_2. As a rule, the system will start. But sometimes mda will stay unhealthy. I found a workaround at https://github.com/jeboehm/docker-mailserver/issues/88:
id=<container ID>; docker container stop $id; docker container start $id
This trick worked every time.
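To avoid looking up the container ID by hand, the same stop/start trick can be applied to whatever Docker currently reports as unhealthy. A minimal sketch, assuming the health filter of docker ps is available in the installed Docker version; restart_unhealthy is a hypothetical name:

```shell
#!/bin/sh
# Stop and start every container that Docker currently reports as unhealthy.
restart_unhealthy() {
    for id in $(docker ps --filter health=unhealthy -q); do
        docker container stop "$id" >/dev/null
        docker container start "$id"
    done
}

# Example:
# restart_unhealthy
```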
So finally I was confident that I could bring up docker-mailserver on IONOS_2 every time and heal the system if necessary. Now this didn't mean that the system worked fine. I spent more than a day trying to find out what the problem was.
On the mail manager system, I installed 2 domains and appropriate users. I could send e-mails from those domains to other e-mail addresses I own, but I could not receive e-mails at these addresses. Also, I wanted to use Thunderbird, which was the beginning of all the trouble.
I couldn't understand why I had been able to do so the day before, as I documented here. The solution to this enigma was quite simple, but I had a hard time finding it.
The successful tests I had performed were e-mails sent from the webmail interface to both of the domains. I obviously didn't test to send e-mails from other, foreign addresses like Google Mail.
In one of my tests the days before, I received an error message from GMX with instructions of how to get more information about the rejection. One of the services to ask was https://mxtoolbox.com/SuperTool.aspx?action=smtp%3avoxx.biz&run=toolpage
I luckily remembered this and found that something was not as it should be: I had an SMTP Reverse DNS Mismatch (Reverse DNS is not a Valid Hostname) in the first line, although in the next line I was told that the IP address of IONOS_2 had been resolved correctly.
This was really enigmatic. Googling told me that the PTR has to be set by the provider. I was confused as to which FQDN I should use -- the main address of the machine which required TLS or the other address I just set up for free to serve docker-mailserver at port 81.
Finally I called the support of IONOS and they told me that I not only could set up the PTR myself, but also that I already had done this. They couldn't find a mistake.
My intuition told me that the PTR entry was not a FQDN -- I could faintly remember I had stumbled upon this mistake before. The support agent first said that a PTR of the form zzz.tld was totally okay, but after some discussion and googling he agreed that a FQDN has to have 3 parts: kkk.zzz.tld.
Okay, I had already set up an A record mail.zzz.tld, and it was easy to test it. I didn't even have to wait. The change at IONOS was done immediately and the test showed that the problem was gone.
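The same check can be done from the command line instead of the web tool: look up the PTR for the IP, then make sure that name resolves back to the same IP. A sketch, assuming dig is installed; check_ptr is a hypothetical helper and the IP in the example is a placeholder:

```shell
#!/bin/sh
# Forward-confirmed reverse DNS check: IP -> PTR name -> A record -> same IP?
check_ptr() {
    ip=$1
    # Reverse lookup; dig prints the name with a trailing dot, which is stripped.
    ptr=$(dig +short -x "$ip" | head -n 1)
    ptr=${ptr%.}
    if [ -z "$ptr" ]; then
        echo "no PTR record for $ip"
        return 1
    fi
    # Forward lookup of the PTR name must contain the original IP.
    if dig +short A "$ptr" | grep -qxF "$ip"; then
        echo "ok: $ip -> $ptr -> $ip"
    else
        echo "mismatch: $ip -> $ptr, but $ptr does not resolve back to $ip"
        return 1
    fi
}

# Example:
# check_ptr <IP of IONOS_2>
```

This only confirms that PTR and A record agree. Whether a receiving server also insists on a three-part name is not something this sketch tests.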
And then came the surprise: all the e-mails which could not be delivered during the last 2 days came in at once for both domains.
The problem with Thunderbird was solved yesterday, but because of the delivery problem, I couldn't really appreciate it. I noticed during my testing that one of the foreign addresses reacted immediately in Thunderbird, whereas others took their time unless I manually downloaded new e-mails. Inspecting the differences, I noticed that the latter were POP accounts, the former IMAP.
So I learned that it is not easy to transform a POP account to IMAP in Thunderbird. You practically immobilize the POP account and set up a new IMAP account. If all works out fine, you may move all of the items of the POP account to the IMAP account.
Setting up a new account in Thunderbird, you can choose between POP and IMAP, and then you can choose manually for the connection details and let Thunderbird find out which conditions the mail server requires. It turned out that both settings require STARTTLS and Password, normal, with the PTR entry as server. You may even omit the first part, so zzz.tld will work as well.
That completes the whole setup.
The error of form
2020/06/06 00:31:33 Problem with dial: dial tcp 217.160.241.84:3306: getsockopt: connection refused. Sleeping 1s
2020/06/06 00:31:33 Problem with dial: dial tcp 217.160.241.84:11334: getsockopt: connection refused. Sleeping 1s
2020/06/06 00:31:33 Timeout after 2m0s waiting on dependencies to become available: [tcp://db:3306 tcp://mda:143 tcp://mta:25 tcp://filter:11334 file:///media/dkim/]
reported above can be produced in 2 ways, adding a network being one. Looking at /root/docker-mailserver/web/rootfs/usr/local/bin/entrypoint.sh, there is a parameter FILTER_HOST set in the Dockerfile:
root@IONOS_1: /root # grep -rn '/root/docker-mailserver'/ -e "FILTER_HOST" | grep -v ':#' | grep -v ':wp_'
/root/docker-mailserver/web/Dockerfile:23: FILTER_HOST=filter \
/root/docker-mailserver/web/rootfs/usr/local/bin/entrypoint.sh:33: -wait tcp://${FILTER_HOST}:11334 \
How come filter resolves to the IP of the host?
/ $ grep -rn '/etc/rspamd' -e '11334' | grep -v ':#' | grep -v ':wp_'
/etc/rspamd/local.d/worker-controller.inc:1:bind_socket = "*:11334";
/etc/rspamd/local.d/worker-controller.inc.templ:1:bind_socket = "*:11334";
/etc/rspamd/worker-controller.conf:4: bind_socket = "localhost:11334";
I guess the problem lies here, but this definitely does not fall into the realm of docker-mailserver.
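One way to narrow this down would be to ask the web container itself what filter resolves to, and compare that with the address Docker assigned to the filter container. A sketch, assuming getent is available in the image (I have not verified that); resolve_in is a hypothetical helper name:

```shell
#!/bin/sh
# Print the first IP address that NAME resolves to inside CONTAINER.
resolve_in() {
    container=$1
    name=$2
    docker exec "$container" getent hosts "$name" | awk '{ print $1; exit }'
}

# Example (container names as printed by docker ps -a):
# resolve_in docker-mailserver_web_1 filter
# docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' docker-mailserver_filter_1
```

If the first command prints the host IP while the second prints an address from the compose network, the name is being resolved outside of Docker's own network.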
There is one more remark:
Thunderbird does not accept the security certificate.
Therefore I have to store an exception.
As I have a letsencrypt certificate for the main domain, maybe I somehow could manage that Thunderbird would get and accept this, but I don't know yet if I will take the pain to research this.
Another addendum to make things clear for anybody who runs into the same situation as I did:
In my humble attempt to mediate between the need for docker-mailserver to talk to a port and to TLS for my main domain on IONOS_2, I introduced a new domain, let's call it port81.tk, to serve docker-mailserver on port 81.
In the admin interface of webmail, I created 2 domains: my main domain, let's call it zzz.tld, to be served otherwise via an nginx proxy with TLS on IONOS_2, and a 2nd domain, let's call it uuu.tld, which is served by a totally different machine (IONOS_1).
Now the question arises which information has to be put where to make things work. So far I was experimenting and found out that the following seems to work well:
In .env, the entry has to be MAILNAME=port81.tk
IMAP account, although it looks like it works with POP just the same, if you do it right.
zzz.tld IONOS_2 should have an MX record and A records of type mail, imap, smtp, although I'm not sure about that. Maybe you don't need it, but certainly it doesn't hurt.
_dmarc (DMARC), 2020._domainkey.zzz.tld (DKIM), mail.zzz.tld (SPF). The same applies to uuu.tld. I don't know if the names of those entries matter, maybe not.
IONOS_2 to point the IP of that machine to zzz.tld.
The last mistake I made was to change MAILNAME=port81.tk to MAILNAME=zzz.tld in an attempt to get rid of the Thunderbird complaint about security. The consequence was that an e-mail to an address at zzz.tld bounced with the message that the recipient was unknown. Let's say the recipient's e-mail address would be me@zzz.tld, then the unknown recipient was me.
This kind of error happened to lots of people (e.g. postfix virtualdomain - message bouncing - unknown user error in maillog), so I quickly found the reason and changed the line myhostname = zzz.tld in /etc/postfix/main.cf in the mta container to myhostname = port81.tk and the error was gone. So I concluded that this entry comes from the .env file, and I was right.
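To see at a glance whether the running mta container and .env agree, the two values can be compared directly. A sketch under the assumption that .env holds plain KEY=value lines without quotes; env_value is a hypothetical helper:

```shell
#!/bin/sh
# Print the value of KEY from an env-style file.
# If KEY is assigned more than once, this helper prints the last assignment.
env_value() {
    file=$1
    key=$2
    sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Example (container name as printed by docker ps -a):
# want=$(env_value ~/docker-mailserver/.env MAILNAME)
# have=$(docker exec docker-mailserver_mta_1 postconf -h myhostname)
# [ "$want" = "$have" ] || echo "myhostname is $have, .env says $want"
```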
Interestingly, I now have duplicated accounts for those 2 domains on Thunderbird, each of POP and IMAP flavor. It turns out that I can send from each of those without problems, but incoming mails are put into the IMAP account.
The reason must be that docker-mailserver sends on port 143 to which the IMAP account listens, whereas the POP account listens to port 110.
One of those accounts is very old and has lots of data, the other is new and has very little data. So this one is the perfect candidate to find out how to get rid of a Thunderbird account without losing data. With drag-and-drop I moved the inbox messages from the POP to the IMAP flavor, and then did the same for the sent messages. How to delete an account? Check the settings, and on the left side at the bottom you'll find a drop-down with the option to add or remove accounts.
For the other one I have to think about another method. As far as I know Thunderbird manages data simply in directories, so that might be easier to do.
Maybe my attempt to get rid of the security complaint was flawed. There may be a configuration which does the right thing, but at the moment I am exhausted and satisfied so far. It works as it should and those quirks don't really annoy me.
Describe the bug A clear and concise description of what the bug is.
After hours of trial and error I decided to refresh my system and start fresh once more. But there it is, the error I have been seeing for days now, so it can be reproduced.
To Reproduce Steps to reproduce the behavior:
see above
Expected behavior A clear and concise description of what you expected to happen.
I witnessed a successful installation before, so I know what to expect.
Screenshots If applicable, add screenshots to help explain your problem.
Docker environment (please complete the following information):
docker info
see below...
docker-compose version
No way to find an explanation for this "Error response from daemon". I had this before when I did my first refresh, and had then found a simple downgrade instruction, but could not find it today.
Instead I found https://github.com/kubernetes-sigs/kubespray/issues/6160 with a trick:
Additional context Add any other context about the problem here.