docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Docker resolve container to ghost ip periodically #638

Open andyceo opened 5 years ago

andyceo commented 5 years ago

Expected behavior

Docker resolves a service name to exactly one IP address. For example, I have a Docker Swarm service named redis with only one container created for it.

I can reach this redis container from other containers attached to the same network by using the service name:

62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.126 ms
64 bytes from 10.0.2.5: seq=1 ttl=64 time=0.110 ms
^C
--- redis ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.110/0.118/0.126 ms

Actual behavior

Docker sometimes resolves the name redis to an unknown ghost IP. It looks like this:

62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.126 ms
64 bytes from 10.0.2.5: seq=1 ttl=64 time=0.110 ms
^C
--- redis ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.110/0.118/0.126 ms
# this is the right container IP = 10.0.2.5

62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.100 ms
^C
--- redis ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.100/0.100/0.100 ms
62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.128 ms
^C
--- redis ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.128/0.128/0.128 ms
# this is the right container IP = 10.0.2.5

62d54058d345:/data# ping redis
PING redis (10.0.2.21): 56 data bytes
64 bytes from 10.0.2.21: seq=0 ttl=64 time=0.100 ms
^C
--- redis ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.100/0.100/0.100 ms
# this is the WRONG container IP = 10.0.2.21

62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.090 ms
^C
--- redis ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.090/0.090/0.090 ms
# this is the right container IP = 10.0.2.5

62d54058d345:/data# ping redis
PING redis (10.0.2.5): 56 data bytes
64 bytes from 10.0.2.5: seq=0 ttl=64 time=0.153 ms
^C
--- redis ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.153/0.153/0.153 ms
# this is the right container IP = 10.0.2.5

62d54058d345:/data# ping redis
PING redis (10.0.2.21): 56 data bytes
64 bytes from 10.0.2.21: seq=0 ttl=64 time=0.121 ms
64 bytes from 10.0.2.21: seq=1 ttl=64 time=0.110 ms
^C
--- redis ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.110/0.115/0.121 ms
# this is the WRONG container IP = 10.0.2.21
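
How often the name resolves to each address could be measured with a small loop instead of repeated manual pings. The following is only a minimal sketch: the function names `resolve_ipv4` and `tally_resolutions` are made up for this illustration, and it assumes the container's system resolver is what `ping` uses.

```python
import socket
from collections import Counter
from typing import Callable, List


def resolve_ipv4(name: str) -> List[str]:
    """Ask the system resolver for the IPv4 addresses of `name`."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def tally_resolutions(name: str, attempts: int,
                      resolver: Callable[[str], List[str]] = resolve_ipv4) -> Counter:
    """Resolve `name` `attempts` times and count how often each address is returned."""
    seen: Counter = Counter()
    for _ in range(attempts):
        for address in resolver(name):
            seen[address] += 1
    return seen
```

Run inside the affected container, `tally_resolutions("redis", attempts=50)` would be expected to return a single key if the name always resolves to one address, and more than one key if a second address shows up intermittently.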

There are no containers with IP 10.0.2.21 in this network:

andyceo@newhope:~$ sudo docker network inspect databases_redis
[
    {
        "Name": "databases_redis",
        "Id": "lawvqp868jenpnzs7f4gg7pwv",
        "Created": "2019-03-30T21:06:29.098125664+03:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.2.0/24",
                    "Gateway": "10.0.2.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "597079393d9d6fb4b8bb08c62dc8ae64d7a14d64e71a6232735c74a37afcd605": {
                "Name": "databases_redis.1.0bt6r2uobfr3f8mz14jcmch63",
                "EndpointID": "26ffb010530c04b86542c1461dde8025ab2205ab2c39f0a6fa69139021d92e82",
                "MacAddress": "02:42:0a:00:02:1f",
                "IPv4Address": "10.0.2.31/24",
                "IPv6Address": ""
            },
            "62d54058d34577745d438dd2070d516e6901a07f535509625f79fb53c0e687f3": {
                "Name": "databasesbckp_redis.1.zdgp0ybem9l7m0c153vazevxd",
                "EndpointID": "0d009e23df96840d2a372d66fd097f137956d004f80620b8e09dee2583af04e1",
                "MacAddress": "02:42:0a:00:02:1c",
                "IPv4Address": "10.0.2.28/24",
                "IPv6Address": ""
            },
            "b1ae33a8876e1218f237bf334eb18ff0a543a0767cfa20ac0a6fd5b83a67b0f8": {
                "Name": "monitoring_grafana.1.nbss9gq2l7jziezfqhm4lk1ju",
                "EndpointID": "459c557564de06416dccd45ced5c76f90d1a98c50da33d6708bde40a3f3eeab4",
                "MacAddress": "02:42:0a:00:02:1d",
                "IPv4Address": "10.0.2.29/24",
                "IPv6Address": ""
            },
            "d684c4f73f5fcc5da607fc8e09ff214128557e4f46c43d3752fd4a1615c113b5": {
                "Name": "developers_d01.1.enofax4wtolndvyph8acu05q2",
                "EndpointID": "9989851ef2c3d4029d8f3277950274572186a48fdb810100d51b80b85affacf4",
                "MacAddress": "02:42:0a:00:02:20",
                "IPv4Address": "10.0.2.32/24",
                "IPv6Address": ""
            },
            "lb-databases_redis": {
                "Name": "databases_redis-endpoint",
                "EndpointID": "a909eeb31a62ee5da700f511ca11a431acd6108f792b878cb6302c67d6bfb652",
                "MacAddress": "02:42:0a:00:02:02",
                "IPv4Address": "10.0.2.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4099"
        },
        "Labels": {
            "com.docker.stack.namespace": "databases"
        },
        "Peers": [
            {
                "Name": "846ce9f7bb28",
                "IP": "192.168.2.237"
            }
        ]
    }
]
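
Searching this JSON by eye is error-prone, so the check could also be scripted. This is a rough sketch, not a definitive tool: `endpoint_ips` and `owners_of` are hypothetical helper names, the sample string is a trimmed copy of two endpoints from the output above, and the input is assumed to be the JSON printed by `docker network inspect` (which on a swarm node may list only the endpoints attached locally).

```python
import json
from typing import Dict, List

# Trimmed sample taken from the inspect output above (two endpoints only).
SAMPLE = """
[{"Name": "databases_redis",
  "Containers": {
    "597079393d9d": {"Name": "databases_redis.1.0bt6r2uobfr3f8mz14jcmch63",
                     "IPv4Address": "10.0.2.31/24"},
    "lb-databases_redis": {"Name": "databases_redis-endpoint",
                           "IPv4Address": "10.0.2.2/24"}}}]
"""


def endpoint_ips(network_inspect_json: str) -> Dict[str, str]:
    """Map each endpoint name to its IPv4 address with the prefix length removed."""
    result: Dict[str, str] = {}
    for network in json.loads(network_inspect_json):
        for endpoint in (network.get("Containers") or {}).values():
            address = endpoint.get("IPv4Address", "")
            if address:
                result[endpoint["Name"]] = address.split("/")[0]
    return result


def owners_of(ip: str, network_inspect_json: str) -> List[str]:
    """Return the names of all endpoints that hold `ip` (empty list if none)."""
    return [name for name, address in endpoint_ips(network_inspect_json).items()
            if address == ip]
```

With the sample data, `owners_of("10.0.2.21", SAMPLE)` returns an empty list, which matches what the full output shows.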

I also listed all containers on all nodes with their names and IPs. The list looks like this:

10.0.10.50 monitoring_certificates.1.orggxw7fyuj3y0akkxy9l9pfp
10.0.15.75 httpd_nginx.1.vjon6i37dh288h8uh86pkd9wq
...
10.0.15.7710.0.10.53 monitoring_telegraf.1.8r0haxmfnzsegxgg2cqledhm8 

I did not find any container with ip = 10.0.2.21 in this list.

I generated this list with the following command (in case it matters):

docker ps -q | xargs -n 1 docker inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}} {{ .Name }}' | sed 's/ \// /'
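
The template in that command prints a container's addresses back to back with no separator, so a container attached to several networks ends up with its IPs run together on one line. A sketch of the same listing built from the JSON that `docker inspect` prints is shown below. The helper name `container_addresses` and the network names in the sample are invented for illustration; only the container name and the two addresses come from the list above.

```python
import json
from typing import List, Tuple

# Hypothetical, heavily trimmed `docker inspect` output for one container
# attached to two networks; "net_a" and "net_b" are placeholder names.
SAMPLE_INSPECT = """
[{"Name": "/monitoring_telegraf.1.8r0haxmfnzsegxgg2cqledhm8",
  "NetworkSettings": {"Networks": {
      "net_a": {"IPAddress": "10.0.15.77"},
      "net_b": {"IPAddress": "10.0.10.53"}}}}]
"""


def container_addresses(inspect_json: str) -> List[Tuple[str, List[str]]]:
    """Return (container name, [IPv4 addresses]) for every inspected container."""
    rows = []
    for container in json.loads(inspect_json):
        networks = (container.get("NetworkSettings") or {}).get("Networks") or {}
        ips = [net["IPAddress"] for net in networks.values() if net.get("IPAddress")]
        # Docker reports names with a leading slash; drop it as the sed call does.
        rows.append((container["Name"].lstrip("/"), ips))
    return rows


for name, ips in container_addresses(SAMPLE_INSPECT):
    print(" ".join(ips), name)
```

Keeping the addresses as a list makes it straightforward to test whether a given IP, such as 10.0.2.21, belongs to any container.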

I also tried to find out what this mysterious container might be by port scanning it from another container in this network:

62d54058d345:/data# nmap  -p0-65535 10.0.2.21 -T5
Starting Nmap 7.70 ( https://nmap.org ) at 2019-04-01 12:15 UTC
Warning: 10.0.2.21 giving up on port because retransmission cap hit (2).
Nmap scan report for 10.0.2.21
Host is up (0.00015s latency).
Not shown: 65529 closed ports
PORT      STATE    SERVICE
16521/tcp filtered unknown
30600/tcp filtered unknown
35885/tcp filtered unknown
50057/tcp filtered unknown
53747/tcp filtered unknown
55251/tcp filtered unknown
58641/tcp filtered unknown
MAC Address: 02:42:0A:00:02:02 (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 180.63 seconds

I have not created any services that use these ports.
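
The MAC address in the nmap report may also be a useful clue. Every endpoint in the `docker network inspect` output above has a MAC of the form `02:42` followed by the four bytes of its IPv4 address (for example `02:42:0a:00:02:1f` for 10.0.2.31). Assuming that pattern holds for the scanned host as well, the MAC can be decoded back into an address. The helper name `mac_to_ipv4` is made up for this sketch.

```python
import ipaddress


def mac_to_ipv4(mac: str) -> str:
    """Decode a MAC of the form 02:42:aa:bb:cc:dd into the IPv4 address aa.bb.cc.dd.

    This relies on the assumption that the MAC was derived from the endpoint's
    IP, as appears to be the case for all endpoints shown in this issue.
    """
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or octets[:2] != [0x02, 0x42]:
        raise ValueError(f"not an IP-derived MAC of the expected form: {mac}")
    return str(ipaddress.IPv4Address(bytes(octets[2:])))


print(mac_to_ipv4("02:42:0A:00:02:02"))
```

For the MAC reported by nmap this gives 10.0.2.2, the address of the `lb-databases_redis` endpoint in the network listing, rather than 10.0.2.21.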

Steps to reproduce the behavior

Unknown. I had been using Docker Swarm without any problems for months, upgrading it from time to time. Today I discovered that my redis-backup container sometimes cannot connect to the redis service in another container.

Output of docker version:

andyceo@newhope:~$ sudo docker version
Client:
 Version:           18.09.4
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        d14af54
 Built:             Wed Mar 27 18:34:51 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.4
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d14af54
  Built:            Wed Mar 27 18:01:48 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

andyceo@newhope:~$ sudo docker info
Containers: 38
 Running: 29
 Paused: 0
 Stopped: 9
Images: 31
Server Version: 18.09.4
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: 4iywpp5y3g770wy7qoi6e2rem
 Is Manager: true
 ClusterID: h6zgtigz5dwc1v3iu0zqidxga
 Managers: 3
 Nodes: 4
 Default Address Pool: 10.0.0.0/8  
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 192.168.2.237
 Manager Addresses:
  192.168.2.237:2377
  192.168.2.239:2377
  192.168.2.241:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-143-generic
Operating System: Ubuntu 16.04.6 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.54GiB
Name: newhope
ID: IMMD:MMDH:CUXV:LCEK:U2GZ:IYOM:DOYY:M376:EE4U:LTKU:CQWM:QVBU
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)

Nothing to add. All nodes are bare-metal hosts running Ubuntu 16.04 with the latest packages and kernel, and with only Docker installed.

johnmccabe commented 5 years ago

I'm seeing similar behaviour on Ubuntu 18.04.2 LTS (arm64), with both 18.09_04 and 18.06.3-ce,