weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

Peers are stuck at 1 on all my nodes #3450

Open mploschiavo opened 5 years ago

mploschiavo commented 5 years ago

I have a 10-node cluster using the Weave overlay. I can't access a container on the Weave overlay network that has a port exposed and a service listening. (I have another instance using Weave that works, but I can't figure out why this one doesn't.) curl 127.0.0.1:6784/status shows only 1 peer in my non-working instance, but in my working instance it shows 10.
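A minimal sketch of how the peer check described above could be scripted, assuming the status text has the same "Key: value" layout as the curl output in this report. The function names `parse_router_status` and `looks_isolated` are hypothetical helpers, not part of Weave Net, and treating "Peers: 1" as "the router sees only itself" is an assumption about how the counter is reported.

```python
import re


def parse_router_status(status_text):
    """Extract the integer Targets, Connections and Peers counters from status text."""
    counters = {}
    for line in status_text.splitlines():
        # "PeerDiscovery: enabled" is skipped because only the exact keys below match.
        match = re.match(r"\s*(Targets|Connections|Peers):\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def looks_isolated(counters):
    """True when the router reports no targets, no connections and a single peer."""
    return (
        counters.get("Targets") == 0
        and counters.get("Connections") == 0
        and counters.get("Peers") == 1
    )


# Router section copied from the status output in this report.
SAMPLE = """\
        Service: router
       Protocol: weave 1..2
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none
"""

if __name__ == "__main__":
    counters = parse_router_status(SAMPLE)
    print(counters, "isolated" if looks_isolated(counters) else "connected")
```

Running the same check on each node would show whether every router is isolated or only some of them.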

netID:3z93n644nbe47dzi6r1a82git leaving:false netPeers:9 entries:22 Queue qLen:0 netMsg/s:0"
Nov 12 16:33:28 mattmenv-mgr-000000 dockerd[1457]: time="2018-11-12T16:33:28.797244846Z" level=info msg="NetworkDB stats mattmenv-mgr-000000(410061a80d70) - netID:muh8ka6b5ve2stwzjhp3o0ovi leaving:false netPeers:10 entries:34 Queue qLen:0 netMsg/s:0"
Nov 12 16:38:28 mattmenv-mgr-000000 dockerd[1457]: time="2018-11-12T16:38:28.997105412Z" level=info msg="NetworkDB stats mattmenv-mgr-000000(410061a80d70) - netID:muh8ka6b5ve2stwzjhp3o0ovi leaving:false netPeers:10 entries:34 Queue qLen:0 netMsg/s:0"
Nov 12 16:38:28 mattmenv-mgr-000000 dockerd[1457]: time="2018-11-12T16:38:28.997217013Z" level=info msg="NetworkDB stats mattmenv-mgr-000000(410061a80d70) - netID:3z93n644nbe47dzi6r1a82git leaving:false netPeers:9 entries:22 Queue qLen:0 netMsg/s:0"
ubuntu@mattmenv-mgr-000000:~$ curl 127.0.0.1:6784/status
        Version: 2.5.0 (up to date; next check at 2018/11/12 18:14:17)

        Service: router
       Protocol: weave 1..2
           Name: ee:b6:f2:df:50:37(mattmenv-mgr-000000)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: idle
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

        Service: plugin (v2)
ubuntu@mattmenv-mgr-000000:~$ ^C
ubuntu@mattmenv-mgr-000000:~$ docker version
Client:
 Version:       18.03.0-ce
 API version:   1.37
 Go version:    go1.9.4
 Git commit:    0520e24
 Built: Wed Mar 21 23:10:01 2018
 OS/Arch:       linux/amd64
 Experimental:  false
 Orchestrator:  swarm

Server:
 Engine:
  Version:      18.03.0-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.4
  Git commit:   0520e24
  Built:        Wed Mar 21 23:08:31 2018
  OS/Arch:      linux/amd64
  Experimental: false
ubuntu@mattmenv-mgr-000000:~$ uname -a
Linux mattmenv-mgr-000000 4.15.0-1021-azure #21~16.04.1-Ubuntu SMP Fri Aug 10 12:36:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@mattmenv-mgr-000000:~$ docker node ls
ID                            HOSTNAME                       STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
kyqsleow148iy62wwq3777mdo     mattmenv-es-hdfs-data-000000   Ready               Active                                  18.03.0-ce
x3dccbrekunqoeqtteto73raa     mattmenv-es-hdfs-nn1-000000    Ready               Active                                  18.03.0-ce
vksphwv268mq49qloni6okib6     mattmenv-es-log-000000         Ready               Active                                  18.03.0-ce
j3nq0layy5k0s11l61mwiyi9s     mattmenv-hdfs-data-000000      Ready               Active                                  18.03.0-ce
gkh55qgq34uyuyexodeahs05h *   mattmenv-mgr-000000            Ready               Active              Leader              18.03.0-ce
5iqyx5c0lxxkyar9sc2jvbxcw     mattmenv-storm-000000          Ready               Active                                  18.03.0-ce
7ixl2leriafaa8ppicrwh636n     mattmenv-storm-000001          Ready               Active                                  18.03.0-ce
hqrxwdgfvttbt47wc87b0iu0s     mattmenv-storm-000002          Ready               Active                                  18.03.0-ce
tdc9derzf3axwije2fibsxrdg     mattmenv-util-000000           Ready               Active                                  18.03.0-ce
yc52uexb9y01i95y4fj5ztg13     mattmenv-zk-kafka-000000       Ready               Active                                  18.03.0-ce
ubuntu@mattmenv-mgr-000000:~$ docker network ls
NETWORK ID          NAME                DRIVER                                 SCOPE
d75d6c560d80        bridge              bridge                                 local
a7e03ba22fc2        docker_gwbridge     bridge                                 local
2eac1456de46        host                host                                   local
muh8ka6b5ve2        ingress             overlay                                swarm
4lhe672o3dsz        mynetwork           weaveworks/net-plugin:latest_release   swarm
87f4d37a3e9d        none                null                                   local
3z93n644nbe4        rsnetwork           weaveworks/net-plugin:latest_release   swarm
ubuntu@mattmenv-mgr-000000:~$ docker network inspect rsnetwork
[
    {
        "Name": "rsnetwork",
        "Id": "3z93n644nbe47dzi6r1a82git",
        "Created": "2018-11-09T23:07:08.903449501Z",
        "Scope": "swarm",
        "Driver": "weaveworks/net-plugin:latest_release",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "3a98c542e3d36a6a01c6bcd8ce11e354b1f3b6efb79644bd8216aa9888e6022a": {
                "Name": "id-manager.1.k0o81294oyrdak9egt09mupkh",
                "EndpointID": "0ee79017964be0b67af5c55dd5af5b4daf4bff55f849d12b9e50f4387d108bf6",
                "MacAddress": "fe:d2:0c:d3:2f:bc",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            },
            "6ff4974013afa5035c25d8810659a4768949a55e36f01ff4144a72101fb02cc2": {
                "Name": "rdp_dev_filebeat-tech.gkh55qgq34uyuyexodeahs05h.uapuxa1837vyzuqoupj0tdkdp",
                "EndpointID": "70667893d838b804dfc59edcaa2e6c11d4ab9a13f0a8cef4c531d0cc88e44a27",
                "MacAddress": "e2:f6:07:da:b0:80",
                "IPv4Address": "10.0.0.10/24",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {},
        "Peers": [
            {
                "Name": "410061a80d70",
                "IP": "10.119.5.4"
            },
            {
                "Name": "8a427c37d35f",
                "IP": "10.119.5.8"
            },
            {
                "Name": "29c7bd54059f",
                "IP": "10.119.5.9"
            },
            {
                "Name": "52727365fc82",
                "IP": "10.119.5.6"
            },
            {
                "Name": "0d7b99140a94",
                "IP": "10.119.5.10"
            },
            {
                "Name": "e9cec0e1de59",
                "IP": "10.119.5.7"
            },
            {
                "Name": "f6995adc522e",
                "IP": "10.119.5.5"
            },
            {
                "Name": "455b445f5c27",
                "IP": "10.119.5.13"
            },
            {
                "Name": "fa049ae9bb95",
                "IP": "10.119.5.11"
            }
        ]
    }
]
ubuntu@mattmenv-mgr-000000:~$ curl 127.0.0.1:6784/status
        Version: 2.5.0 (up to date; next check at 2018/11/12 18:14:17)

        Service: router
       Protocol: weave 1..2
           Name: ee:b6:f2:df:50:37(mattmenv-mgr-000000)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: idle
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

        Service: plugin (v2)
ubuntu@mattmenv-mgr-000000:~$ docker ps
CONTAINER ID        IMAGE                                                            COMMAND                  CREATED             STATUS              PORTS                    NAMES
6ff4974013af        579800346274.dkr.ecr.us-west-2.amazonaws.com/filebeat-tech:2.4   "/usr/share/entrypoi…"   2 days ago          Up 2 days                                    rdp_dev_filebeat-tech.gkh55qgq34uyuyexodeahs05h.uapuxa1837vyzuqoupj0tdkdp
3a98c542e3d3        rscaptain/id-manager:1.5                                         "uwsgi --http 0.0.0.…"   2 days ago          Up 2 days           18001/tcp                id-manager.1.k0o81294oyrdak9egt09mupkh
714253ee0f85        docker4x/l4controller-azure:18.03.0-ce-azure1                    "loadbalancer run --…"   2 days ago          Up 2 days                                    editions_controller
769e04e15e97        rscaptain/swarm-meta:1.1                                         "uwsgi --http 0.0.0.…"   2 days ago          Up 2 days           0.0.0.0:9024->8000/tcp   meta
0cd6ba80ad16        rscaptain/guide-azure:1.0                                        "/entry.sh"              2 days ago          Up 2 days                                    editions_guide
2c2085c00260        docker4x/logger-azure:18.03.0-ce-azure1                          "python /server.py"      2 days ago          Up 2 days           0.0.0.0:514->514/udp     editions_logger
ubuntu@mattmenv-mgr-000000:~$ docker node ls
ID                            HOSTNAME                       STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
kyqsleow148iy62wwq3777mdo     mattmenv-es-hdfs-data-000000   Ready               Active                                  18.03.0-ce
x3dccbrekunqoeqtteto73raa     mattmenv-es-hdfs-nn1-000000    Ready               Active                                  18.03.0-ce
vksphwv268mq49qloni6okib6     mattmenv-es-log-000000         Ready               Active                                  18.03.0-ce
j3nq0layy5k0s11l61mwiyi9s     mattmenv-hdfs-data-000000      Ready               Active                                  18.03.0-ce
gkh55qgq34uyuyexodeahs05h *   mattmenv-mgr-000000            Ready               Active              Leader              18.03.0-ce
5iqyx5c0lxxkyar9sc2jvbxcw     mattmenv-storm-000000          Ready               Active                                  18.03.0-ce
7ixl2leriafaa8ppicrwh636n     mattmenv-storm-000001          Ready               Active                                  18.03.0-ce
hqrxwdgfvttbt47wc87b0iu0s     mattmenv-storm-000002          Ready               Active                                  18.03.0-ce
tdc9derzf3axwije2fibsxrdg     mattmenv-util-000000           Ready               Active                                  18.03.0-ce
yc52uexb9y01i95y4fj5ztg13     mattmenv-zk-kafka-000000       Ready               Active                                  18.03.0-ce
ubuntu@mattmenv-mgr-000000:~$ ssh mattmenv-es-hdfs-data-000000
The authenticity of host 'mattmenv-es-hdfs-data-000000 (10.119.5.13)' can't be established.
ECDSA key fingerprint is SHA256:dJrjrNy7AlqSTY9F15UIOyFNkt1URA+goQ63wXkcXpk.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'mattmenv-es-hdfs-data-000000,10.119.5.13' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.15.0-1021-azure x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

42 packages can be updated.
0 updates are security updates.

*** System restart required ***

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@mattmenv-es-hdfs-data-000000:~$ curl http://localhost:18001/get_zk_count
^C
ubuntu@mattmenv-es-hdfs-data-000000:~$ curl http://mattmenv-mgr-000000:18001/get_zk_count
^C
ubuntu@mattmenv-es-hdfs-data-000000:~$ ping mattmenv-mgr-000000
PING mattmenv-mgr-000000.cg1q1ggmvwgubbgdaaq2dmb4ee.bx.internal.cloudapp.net (10.119.5.4) 56(84) bytes of data.
64 bytes from 10.119.5.4: icmp_seq=1 ttl=64 time=0.632 ms
64 bytes from 10.119.5.4: icmp_seq=2 ttl=64 time=0.795 ms
^C
--- mattmenv-mgr-000000.cg1q1ggmvwgubbgdaaq2dmb4ee.bx.internal.cloudapp.net ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.632/0.713/0.795/0.085 ms
ubuntu@mattmenv-es-hdfs-data-000000:~$ curl http://10.119.5.4:18001/get_zk_count
c^C
ubuntu@mattmenv-es-hdfs-data-000000:~$ ip route
default via 10.119.5.1 dev eth0
10.119.5.0/24 dev eth0  proto kernel  scope link  src 10.119.5.13
168.63.129.16 via 10.119.5.1 dev eth0
169.254.169.254 via 10.119.5.1 dev eth0
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
172.18.0.0/16 dev docker_gwbridge  proto kernel  scope link  src 172.18.0.1
ubuntu@mattmenv-es-hdfs-data-000000:~$ ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: eth0    inet 10.119.5.13/24 brd 10.119.5.255 scope global eth0\       valid_lft forever preferred_lft forever
3: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
24: docker_gwbridge    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge\       valid_lft forever preferred_lft forever
ubuntu@mattmenv-es-hdfs-data-000000:~$

EDIT: I made the transcript code-style because parts of it were displaying in all different formats.
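The dockerd "NetworkDB stats" lines at the top of the transcript report netPeers:10 for the ingress network but netPeers:9 for rsnetwork. Below is a rough sketch for pulling those numbers out of a log, assuming the message format shown in the transcript. The helper names are hypothetical, and a count below the node count is only a hint, not proof of a fault, since a node may join a network only once it runs a task attached to it.

```python
import re

# Matches the fields of a "NetworkDB stats" message as they appear in the transcript.
STATS_RE = re.compile(r"netID:(\S+) leaving:\S+ netPeers:(\d+) entries:(\d+)")


def net_peers_by_network(log_text):
    """Map each netID to the most recent netPeers value found in the log text."""
    latest = {}
    for match in STATS_RE.finditer(log_text):
        latest[match.group(1)] = int(match.group(2))
    return latest


def networks_below(latest, expected_nodes):
    """Return the netIDs whose latest netPeers value is below expected_nodes."""
    return sorted(net_id for net_id, peers in latest.items() if peers < expected_nodes)


# Message portions copied from the dockerd log lines in this report.
SAMPLE_LOG = """\
NetworkDB stats mattmenv-mgr-000000(410061a80d70) - netID:muh8ka6b5ve2stwzjhp3o0ovi leaving:false netPeers:10 entries:34 Queue qLen:0 netMsg/s:0
NetworkDB stats mattmenv-mgr-000000(410061a80d70) - netID:3z93n644nbe47dzi6r1a82git leaving:false netPeers:9 entries:22 Queue qLen:0 netMsg/s:0
"""

if __name__ == "__main__":
    latest = net_peers_by_network(SAMPLE_LOG)
    # 10 is the number of nodes listed by "docker node ls" in the transcript.
    print(networks_below(latest, expected_nodes=10))
```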

bboreham commented 5 years ago

Sounds similar to #3350 - can we get the Docker log from when you installed the Weave Net plugin?

(Or maybe the log from last restart of Docker - really I mean "don't try to edit down the log to the bits you think are interesting")
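One possible way to capture that log without trimming it, sketched under the assumption that Docker runs as the systemd unit docker.service on these Ubuntu hosts and that journalctl is available. The helper names and the output file name are hypothetical, and the example date is simply the creation date of rsnetwork shown in the report.

```python
import shutil
import subprocess


def docker_log_command(since=None):
    """Build a journalctl command that dumps the Docker unit's log unfiltered."""
    command = ["journalctl", "--unit", "docker.service", "--no-pager"]
    if since is not None:
        # Optional lower bound on time; nothing else is filtered out.
        command += ["--since", since]
    return command


def collect_docker_log(output_path, since=None):
    """Write the log to output_path; return False if journalctl is not installed."""
    if shutil.which("journalctl") is None:
        return False
    with open(output_path, "w") as handle:
        subprocess.run(docker_log_command(since), stdout=handle, check=True)
    return True


if __name__ == "__main__":
    # Assumed start date: the day rsnetwork was created according to the report.
    if not collect_docker_log("docker-full.log", since="2018-11-09"):
        print("journalctl not found; collect the Docker log another way")
```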