weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

docker v2 plugin "No route to host" on digitalocean #3832

Closed jamshid closed 4 years ago

jamshid commented 4 years ago

What you expected to happen?

The weave v2 plugin works well on my home network with ubuntu 18 docker servers joined in a docker swarm cluster, but not in digitalocean. Weave DNS works across the two docker servers in digitalocean, but not actual traffic.

Can you let me know if digitalocean is not supported or has known problems before I try to troubleshoot further?

What happened?

I can start a container on the weave network on both the swarm manager and worker nodes, and DNS resolves the container names, but traffic isn't working.

How to reproduce it?

I create two docker servers. I install and configure the weave plugin (with multicast and encryption enabled) on both docker servers. Then I join them in a swarm cluster. (This is the same thing I did on my home network which works).

Manager:

docker-machine create --driver digitalocean --digitalocean-access-token=SECRET --digitalocean-size=8gb --digitalocean-image ubuntu-18-04-x64 atlantic

docker plugin install weaveworks/net-plugin:latest_release && docker plugin disable weaveworks/net-plugin:latest_release && docker plugin set weaveworks/net-plugin:latest_release WEAVE_PASSWORD=SECRET WEAVE_MULTICAST=1 && docker plugin enable weaveworks/net-plugin:latest_release

docker swarm init --advertise-addr 167.71.X.Y

Worker:

docker-machine create --driver digitalocean --digitalocean-access-token=SECRET --digitalocean-size=8gb --digitalocean-image ubuntu-18-04-x64 atlantic2

docker plugin install weaveworks/net-plugin:latest_release && docker plugin disable weaveworks/net-plugin:latest_release && docker plugin set weaveworks/net-plugin:latest_release WEAVE_PASSWORD=SECRET WEAVE_MULTICAST=1 && docker plugin enable weaveworks/net-plugin:latest_release

docker swarm join --token SWMTKN-1-0vlialgn83ffwsu29u7olj8gd7bqbvc401pvcuzg130vbw81bn-X 167.71.X.Y:2377 --advertise-addr 167.172.X.Y

Then create a weave network on the manager:

docker network create --driver=weaveworks/net-plugin:latest_release --attachable weave

Then start a container on manager with netcat listening on a port:

docker run -ti --name weavetest  --network weave centos bash
[root@39cfa56cfa86 /]# yum install -y nc
[root@39cfa56cfa86 /]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
18: ethwe0@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP group default 
    link/ether 2a:c4:45:62:3a:25 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.2/24 brd 10.0.1.255 scope global ethwe0
       valid_lft forever preferred_lft forever
20: eth0@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever

[root@39cfa56cfa86 /]# ip route
default via 172.18.0.1 dev eth0 
10.0.1.0/24 dev ethwe0 proto kernel scope link src 10.0.1.2 
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.3 
224.0.0.0/4 dev ethwe0 scope link 

[root@39cfa56cfa86 /]#  ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
18: ethwe0    inet 10.0.1.2/24 brd 10.0.1.255 scope global ethwe0\       valid_lft forever preferred_lft forever
20: eth0    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth0\       valid_lft forever preferred_lft forever

[root@39cfa56cfa86 /]# nc -l 7777

Finally, start a container on the worker. While it can resolve the container name weavetest, it cannot reach it by name or IP.

docker run -ti --name remoteweave  --network weave centos bash

[root@870634f32b31 /]# ping weavetest
PING weavetest (10.0.1.2) 56(84) bytes of data.
From 870634f32b31 (10.0.1.5) icmp_seq=1 Destination Host Unreachable
From 870634f32b31 (10.0.1.5) icmp_seq=2 Destination Host Unreachable
From 870634f32b31 (10.0.1.5) icmp_seq=3 Destination Host Unreachable

[root@870634f32b31 /]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
22: ethwe0@if23: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP group default 
    link/ether 7a:e4:0f:60:64:5e brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.5/24 brd 10.0.1.255 scope global ethwe0
       valid_lft forever preferred_lft forever
24: eth0@if25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever

[root@870634f32b31 /]# ip route
default via 172.18.0.1 dev eth0 
10.0.1.0/24 dev ethwe0 proto kernel scope link src 10.0.1.5 
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.3 
224.0.0.0/4 dev ethwe0 scope link 

[root@870634f32b31 /]#  ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
22: ethwe0    inet 10.0.1.5/24 brd 10.0.1.255 scope global ethwe0\       valid_lft forever preferred_lft forever
24: eth0    inet 172.18.0.3/16 brd 172.18.255.255 scope global eth0\       valid_lft forever preferred_lft forever

[root@870634f32b31 /]#  curl 10.0.1.2:7777
curl: (7) Failed to connect to 10.0.1.2 port 7777: No route to host
[root@870634f32b31 /]# curl -i weavetest:7777
curl: (7) Failed to connect to weavetest port 7777: No route to host

Btw doing this curl from a container on the manager works fine.

Anything else we need to know?

Running on DigitalOcean; all commands I used are listed above.

Versions:

$ weave version
weave script 2.6.2
weave 2.6.5
jambook:dockerfiles jamshid$ docker version
Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b
 Built:             Wed Mar 11 01:21:11 2020
 OS/Arch:           darwin/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:44:07 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
 Kubernetes:
  Version:          v1.16.6-beta.0
  StackAPI:         v1beta2

root@atlantic:~# uname -a
Linux atlantic 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

root@atlantic2:~# uname -a
Linux atlantic2 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Logs:

$ docker logs weave
N/A this is plugin

# MANAGER weave status:
$ weave status

        Version: 2.6.5 (up to date; next check at 2020/07/22 21:21:07)

        Service: router
       Protocol: weave 1..2
           Name: f2:ad:a8:14:28:7d(atlantic)
     Encryption: enabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: idle
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

        Service: plugin (v2)

# WORKER weave status:
$ weave status

        Version: 2.6.5 (up to date; next check at 2020/07/22 22:36:44)

        Service: router
       Protocol: weave 1..2
           Name: 56:78:d4:bf:44:08(atlantic2)
     Encryption: enabled
  PeerDiscovery: enabled
        Targets: 0
    Connections: 0
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: idle
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12

        Service: plugin (v2)
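
Both status outputs above report Targets: 0, Connections: 0 and Peers: 1, which suggests that each router only knows about itself. A minimal sketch of automating that check, assuming the `weave status` layout shown above; check_weave_peers is a hypothetical helper name, not part of Weave:

```shell
#!/bin/sh
# Sketch: flag a Weave router that has not connected to any other peer.
# check_weave_peers is a hypothetical helper; it reads `weave status`
# text on stdin and inspects the "Connections:" and "Peers:" lines.
check_weave_peers() {
    awk -F': *' '
        $1 ~ /^[ \t]*Connections$/ { conns = $2 + 0 }
        $1 ~ /^[ \t]*Peers$/       { peers = $2 + 0 }
        END {
            # A lone router reports itself as the only peer.
            if (peers <= 1 || conns == 0) {
                printf "isolated: peers=%d connections=%d\n", peers, conns
                exit 1
            }
            printf "connected: peers=%d connections=%d\n", peers, conns
        }
    '
}

# Usage on a node: weave status | check_weave_peers
```

The function exits non-zero when the router looks isolated, so it can gate later steps in a provisioning script.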

Network:

The output of these commands from inside the containers is shown above. Here is the output of the same commands run on the docker servers.

MANAGER:

root@atlantic:~#  ip route
default via 167.71.240.1 dev eth0 proto static 
10.17.0.0/16 dev eth0 proto kernel scope link src 10.17.0.5 
167.71.240.0/20 dev eth0 proto kernel scope link src 167.71.247.124 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1 
root@atlantic:~#  ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: eth0    inet 167.71.247.124/20 brd 167.71.255.255 scope global eth0\       valid_lft forever preferred_lft forever
2: eth0    inet 10.17.0.5/16 brd 10.17.255.255 scope global eth0\       valid_lft forever preferred_lft forever
10: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
15: docker_gwbridge    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge\       valid_lft forever preferred_lft forever
root@atlantic:~# sudo iptables-save
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:50:22 2020
*mangle
:PREROUTING ACCEPT [32574:135854087]
:INPUT ACCEPT [29918:127051599]
:FORWARD ACCEPT [2656:8802488]
:OUTPUT ACCEPT [18353:1919394]
:POSTROUTING ACCEPT [21009:10721882]
:WEAVE-IPSEC-IN - [0:0]
:WEAVE-IPSEC-IN-MARK - [0:0]
:WEAVE-IPSEC-OUT - [0:0]
:WEAVE-IPSEC-OUT-MARK - [0:0]
-A INPUT -j WEAVE-IPSEC-IN
-A OUTPUT -j WEAVE-IPSEC-OUT
-A WEAVE-IPSEC-IN-MARK -j MARK --set-xmark 0x20000/0x20000
-A WEAVE-IPSEC-OUT-MARK -j MARK --set-xmark 0x20000/0x20000
COMMIT
# Completed on Wed Jul 22 17:50:22 2020
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:50:22 2020
*nat
:PREROUTING ACCEPT [989:54150]
:INPUT ACCEPT [940:50998]
:OUTPUT ACCEPT [539:33747]
:POSTROUTING ACCEPT [540:33807]
:DOCKER - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.18.0.0/16 ! -o docker_gwbridge -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -j WEAVE
-A DOCKER -i docker_gwbridge -j RETURN
-A DOCKER -i docker0 -j RETURN
COMMIT
# Completed on Wed Jul 22 17:50:22 2020
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:50:22 2020
*filter
:INPUT ACCEPT [8329:813213]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [7212:868792]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
:WEAVE-EXPOSE - [0:0]
:WEAVE-IPSEC-IN - [0:0]
-A INPUT -d 127.0.0.1/32 -p tcp -m tcp --dport 6784 -m addrtype ! --src-type LOCAL -m conntrack ! --ctstate RELATED,ESTABLISHED -m comment --comment "Block non-local access to Weave Net control port" -j DROP
-A INPUT -j WEAVE-IPSEC-IN
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker_gwbridge -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker_gwbridge -j DOCKER
-A FORWARD -i docker_gwbridge ! -o docker_gwbridge -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -i weave -o weave -j ACCEPT
-A FORWARD -o weave -j WEAVE-EXPOSE
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker_gwbridge -o docker_gwbridge -j DROP
-A OUTPUT ! -p esp -m policy --dir out --pol none -m mark --mark 0x20000/0x20000 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i docker_gwbridge ! -o docker_gwbridge -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker_gwbridge -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Wed Jul 22 17:50:22 2020

WORKER:

root@atlantic2:~# ip route
default via 167.172.224.1 dev eth0 proto static 
10.17.0.0/16 dev eth0 proto kernel scope link src 10.17.0.6 
167.172.224.0/20 dev eth0 proto kernel scope link src 167.172.226.42 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
172.18.0.0/16 dev docker_gwbridge proto kernel scope link src 172.18.0.1 
root@atlantic2:~#  ip -4 -o addr
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
2: eth0    inet 167.172.226.42/20 brd 167.172.239.255 scope global eth0\       valid_lft forever preferred_lft forever
2: eth0    inet 10.17.0.6/16 brd 10.17.255.255 scope global eth0\       valid_lft forever preferred_lft forever
10: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0\       valid_lft forever preferred_lft forever
15: docker_gwbridge    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge\       valid_lft forever preferred_lft forever
root@atlantic2:~# sudo iptables-save
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:48:54 2020
*mangle
:PREROUTING ACCEPT [20925:76927378]
:INPUT ACCEPT [20925:76927378]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [18403:1965498]
:POSTROUTING ACCEPT [18403:1965498]
:WEAVE-IPSEC-IN - [0:0]
:WEAVE-IPSEC-IN-MARK - [0:0]
:WEAVE-IPSEC-OUT - [0:0]
:WEAVE-IPSEC-OUT-MARK - [0:0]
-A INPUT -j WEAVE-IPSEC-IN
-A OUTPUT -j WEAVE-IPSEC-OUT
-A WEAVE-IPSEC-IN-MARK -j MARK --set-xmark 0x20000/0x20000
-A WEAVE-IPSEC-OUT-MARK -j MARK --set-xmark 0x20000/0x20000
COMMIT
# Completed on Wed Jul 22 17:48:54 2020
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:48:54 2020
*nat
:PREROUTING ACCEPT [1198:65266]
:INPUT ACCEPT [1198:65266]
:OUTPUT ACCEPT [473:28952]
:POSTROUTING ACCEPT [473:28952]
:DOCKER - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.18.0.0/16 ! -o docker_gwbridge -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -j WEAVE
-A DOCKER -i docker_gwbridge -j RETURN
-A DOCKER -i docker0 -j RETURN
COMMIT
# Completed on Wed Jul 22 17:48:54 2020
# Generated by iptables-save v1.6.1 on Wed Jul 22 17:48:54 2020
*filter
:INPUT ACCEPT [915:92346]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [935:102778]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
:WEAVE-EXPOSE - [0:0]
:WEAVE-IPSEC-IN - [0:0]
-A INPUT -d 127.0.0.1/32 -p tcp -m tcp --dport 6784 -m addrtype ! --src-type LOCAL -m conntrack ! --ctstate RELATED,ESTABLISHED -m comment --comment "Block non-local access to Weave Net control port" -j DROP
-A INPUT -j WEAVE-IPSEC-IN
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker_gwbridge -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker_gwbridge -j DOCKER
-A FORWARD -i docker_gwbridge ! -o docker_gwbridge -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -i weave -o weave -j ACCEPT
-A FORWARD -o weave -j WEAVE-EXPOSE
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker_gwbridge -o docker_gwbridge -j DROP
-A OUTPUT ! -p esp -m policy --dir out --pol none -m mark --mark 0x20000/0x20000 -j DROP
-A DOCKER-ISOLATION-STAGE-1 -i docker_gwbridge ! -o docker_gwbridge -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker_gwbridge -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
COMMIT
# Completed on Wed Jul 22 17:48:54 2020

jamshid commented 4 years ago

Sorry, I thought I attached the journalctl -u docker logs but don't see them. Here they are: Manager: atlantic.log Worker: atlantic2.log

jamshid commented 4 years ago

Same behavior on azure. Does the weave v2 plugin not work in cloud environments? Is some other setup required? I see this old 2015 blog post that uses the legacy weave network on azure: https://www.weave.works/blog/microsoft-azure-docker-networking-ansible-weave/

bboreham commented 4 years ago

Yes, it is supposed to work in the cloud. We test every commit with this script running on Google Cloud.

Looking at the logs, it doesn't seem to have known to connect to any other nodes. Try creating the swarm cluster before installing the Weave Net plugin.

jamshid commented 4 years ago

Thanks so much @bboreham, that was exactly the problem. Once I made the worker node join the swarm cluster before installing the weave plugin, I was able to use the weave network that was created on the manager. Verified in both digitalocean and azure. Great tech, the only container networking solution I know that handles multicast.
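
For anyone landing here with the same symptom, the worker-side order that resolved it can be sketched as below. This is only an illustration built from the commands in the original report: the run wrapper, the DRY_RUN variable, the setup_worker name and its three arguments are hypothetical additions, and the --advertise-addr flag from the original join command is left out for brevity.

```shell
#!/bin/sh
# Sketch only: worker-node setup in the order reported to work in this thread.
# DRY_RUN, run() and setup_worker() are illustrative, not part of Docker or Weave.
PLUGIN="weaveworks/net-plugin:latest_release"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = swarm join token, $2 = manager address, $3 = Weave password (placeholders).
setup_worker() {
    # Step 1: join the swarm before touching the plugin.
    run docker swarm join --token "$1" "$2:2377" || return 1
    # Step 2: install and configure the plugin, same commands as the report.
    run docker plugin install "$PLUGIN" &&
        run docker plugin disable "$PLUGIN" &&
        run docker plugin set "$PLUGIN" WEAVE_PASSWORD="$3" WEAVE_MULTICAST=1 &&
        run docker plugin enable "$PLUGIN"
}

# Example (prints the commands only): setup_worker "$TOKEN" 167.71.X.Y "$PASSWORD"
```

With DRY_RUN left at its default the function only prints the five commands in order; setting DRY_RUN=0 would actually run them.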