weaveworks / weave

Simple, resilient multi-host container networking, and more.
https://www.weave.works
Apache License 2.0
6.62k stars · 670 forks

Weave Net does not come up after joining k8s nodes within VirtualBox. #3363

Closed hhrutter closed 6 years ago

hhrutter commented 6 years ago

What did you expect to happen?

After installing a k8s cluster, Weave should be up and running on all nodes after kubeadm join.

What happened?

I ran kubeadm init --apiserver-advertise-address=192.168.100.1, copied admin.conf to $HOME/.kube/config, and applied the Weave manifest with kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')". Then I joined 2 nodes with:

kubeadm join 192.168.100.1:6443 --token qymtuk.q4gwfq32v4lrmrnx --discovery-token-ca-cert-hash sha256:1fb743d771a2dcd15b62150c1ef35d52ecc5d0498155334361aa2c965fa844ac

The 2 worker nodes don't come up:

$ kubectl get nodes
NAME      STATUS     ROLES     AGE       VERSION
master    Ready      master    1h        v1.11.1
node1     NotReady   <none>    1h        v1.11.1
node2     NotReady   <none>    1h        v1.11.1
$ kubectl get pods -n kube-system -o wide
NAME                             READY     STATUS             RESTARTS   AGE       IP              NODE
coredns-78fcdf6894-66tq2         1/1       Running            0          1h        10.32.0.3       master
coredns-78fcdf6894-n6xx9         1/1       Running            0          1h        10.32.0.2       master
etcd-master                      1/1       Running            0          1h        192.168.100.1   master
kube-apiserver-master            1/1       Running            0          1h        192.168.100.1   master
kube-controller-manager-master   1/1       Running            0          1h        192.168.100.1   master
kube-proxy-4zqc6                 1/1       Running            0          1h        192.168.100.1   master
kube-proxy-gcxk8                 1/1       Running            0          1h        192.168.100.3   node2
kube-proxy-stl7h                 1/1       Running            0          1h        192.168.100.2   node1
kube-scheduler-master            1/1       Running            0          1h        192.168.100.1   master
weave-net-4mkc5                  1/2       CrashLoopBackOff   16         1h        192.168.100.2   node1
weave-net-q76dv                  2/2       Running            0          1h        192.168.100.1   master
weave-net-qmnqj                  1/2       CrashLoopBackOff   16         1h        192.168.100.3   node2

How to reproduce it?

Host OS: macOS 10.13.6 running VirtualBox 5.2.16. Guest OS: 3 x CentOS 7.5.

master: 192.168.100.1

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:74:29:ae brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 brd 192.168.100.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe74:29ae/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:53:1c:e2 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.15/24 brd 10.0.3.255 scope global noprefixroute dynamic enp0s8
       valid_lft 80189sec preferred_lft 80189sec
    inet6 fe80::a00:27ff:fe53:1ce2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:27:b8:3d:e7 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
7: datapath: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 0e:b8:bf:6c:fa:00 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::cb8:bfff:fe6c:fa00/64 scope link 
       valid_lft forever preferred_lft forever
9: weave: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue state UP group default qlen 1000
    link/ether 4e:64:e2:23:7e:07 brd ff:ff:ff:ff:ff:ff
    inet 10.32.0.1/12 brd 10.47.255.255 scope global weave
       valid_lft forever preferred_lft forever
    inet6 fe80::4c64:e2ff:fe23:7e07/64 scope link 
       valid_lft forever preferred_lft forever
10: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6e:5b:b5:ba:c8:6f brd ff:ff:ff:ff:ff:ff
12: vethwe-datapath@vethwe-bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue master datapath state UP group default 
    link/ether 86:c8:c2:10:98:70 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::84c8:c2ff:fe10:9870/64 scope link 
       valid_lft forever preferred_lft forever
13: vethwe-bridge@vethwe-datapath: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue master weave state UP group default 
    link/ether d6:01:29:6d:d9:0d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d401:29ff:fe6d:d90d/64 scope link 
       valid_lft forever preferred_lft forever
14: vxlan-6784: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc noqueue master datapath state UNKNOWN group default qlen 1000
    link/ether e2:ab:03:15:d7:a6 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::e0ab:3ff:fe15:d7a6/64 scope link 
       valid_lft forever preferred_lft forever
16: vethwepl74c3dd0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue master weave state UP group default 
    link/ether fe:d6:4e:9c:27:41 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::fcd6:4eff:fe9c:2741/64 scope link 
       valid_lft forever preferred_lft forever
18: vethwepl5346fb3@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1376 qdisc noqueue master weave state UP group default 
    link/ether 56:fc:cb:7a:26:9b brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::54fc:cbff:fe7a:269b/64 scope link 
       valid_lft forever preferred_lft forever

node1: 192.168.100.2

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:ed:91:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.2/24 brd 192.168.100.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:feed:91cf/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:b0:a6:a8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.4.15/24 brd 10.0.4.255 scope global noprefixroute dynamic enp0s9
       valid_lft 80271sec preferred_lft 80271sec
    inet6 fe80::2ff1:9c8c:1157:e470/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:db:42:aa:33 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

node2: 192.168.100.3

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:14:cc:da brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.3/24 brd 192.168.100.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe14:ccda/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:9c:db:9e brd ff:ff:ff:ff:ff:ff
    inet 10.0.5.15/24 brd 10.0.5.255 scope global noprefixroute dynamic enp0s10
       valid_lft 80211sec preferred_lft 80211sec
    inet6 fe80::24a3:5098:2852:891f/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:5f:8b:6d brd ff:ff:ff:ff:ff:ff
6: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:b8:b9:d7:84 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

Anything else we need to know?

I can provide anything - just let me know.

Versions:

$ docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-68.gitdded712.el7.centos.x86_64
 Go version:      go1.9.4
 Git commit:      dded712/1.13.1
 Built:           Tue Jul 17 18:34:48 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-68.gitdded712.el7.centos.x86_64
 Go version:      go1.9.4
 Git commit:      dded712/1.13.1
 Built:           Tue Jul 17 18:34:48 2018
 OS/Arch:         linux/amd64
 Experimental:    false
$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core) 
$ uname -a
Linux master 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16 16:29:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
$ systemctl status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sat 2018-07-28 23:04:45 CEST; 1h 25min ago
     Docs: http://kubernetes.io/docs/
 Main PID: 6089 (kubelet)
    Tasks: 16
   CGroup: /system.slice/kubelet.service
           └─6089 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni

Jul 29 00:29:24 master kubelet[6089]: E0729 00:29:24.239511    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Jul 29 00:29:34 master kubelet[6089]: E0729 00:29:34.253179    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Jul 29 00:29:44 master kubelet[6089]: E0729 00:29:44.267224    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Jul 29 00:29:46 master kubelet[6089]: W0729 00:29:46.055203    6089 container_manager_linux.go:792] CPUAccounting not enabled for pid: 4301
Jul 29 00:29:46 master kubelet[6089]: W0729 00:29:46.055226    6089 container_manager_linux.go:795] MemoryAccounting not enabled for pid: 4301
Jul 29 00:29:46 master kubelet[6089]: W0729 00:29:46.055615    6089 container_manager_linux.go:792] CPUAccounting not enabled for pid: 6089
Jul 29 00:29:46 master kubelet[6089]: W0729 00:29:46.055624    6089 container_manager_linux.go:795] MemoryAccounting not enabled for pid: 6089
Jul 29 00:29:54 master kubelet[6089]: E0729 00:29:54.283487    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Jul 29 00:30:04 master kubelet[6089]: E0729 00:30:04.297750    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"
Jul 29 00:30:14 master kubelet[6089]: E0729 00:30:14.310231    6089 summary.go:102] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"

Logs:

$ kubectl get pods -n kube-system -o wide | grep weave-net
weave-net-4mkc5                  1/2       CrashLoopBackOff   14         58m       192.168.100.2   node1
weave-net-q76dv                  2/2       Running            0          1h        192.168.100.1   master
weave-net-qmnqj                  1/2       CrashLoopBackOff   14         58m       192.168.100.3   node2
$ kubectl exec -n kube-system weave-net-q76dv -c weave -- /home/weave/weave --local status

        Version: 2.4.0 (up to date; next check at 2018/07/29 02:57:17)

        Service: router
       Protocol: weave 1..2
           Name: 4e:64:e2:23:7e:07(master)
     Encryption: disabled
  PeerDiscovery: enabled
        Targets: 1
    Connections: 1 (1 failed)
          Peers: 1
 TrustedSubnets: none

        Service: ipam
         Status: ready
          Range: 10.32.0.0/12
  DefaultSubnet: 10.32.0.0/12
$ kubectl exec -n kube-system weave-net-4mkc5 -c weave -- /home/weave/weave --local status
error: unable to upgrade connection: container not found ("weave")
$ kubectl logs -n kube-system weave-net-q76dv weave
INFO: 2018/07/28 21:19:01.537057 Command line options: map[expect-npc:true nickname:master db-prefix:/weavedb/weave-net ipalloc-init:consensus=1 name:4e:64:e2:23:7e:07 conn-limit:100 datapath:datapath docker-api: host-root:/host http-addr:127.0.0.1:6784 ipalloc-range:10.32.0.0/12 metrics-addr:0.0.0.0:6782 no-dns:true port:6783]
INFO: 2018/07/28 21:19:01.537185 weave  2.4.0
INFO: 2018/07/28 21:19:01.538530 failed to create weave-test-comment6bcaa950; disabling comment support
INFO: 2018/07/28 21:19:01.812958 Bridge type is bridged_fastdp
INFO: 2018/07/28 21:19:01.812976 Communication between peers is unencrypted.
INFO: 2018/07/28 21:19:01.842972 Our name is 4e:64:e2:23:7e:07(master)
INFO: 2018/07/28 21:19:01.843113 Launch detected - using supplied peer list: [192.168.100.1]
INFO: 2018/07/28 21:19:01.843136 Checking for pre-existing addresses on weave bridge
INFO: 2018/07/28 21:19:01.941361 [allocator 4e:64:e2:23:7e:07] No valid persisted data
INFO: 2018/07/28 21:19:01.953059 [allocator 4e:64:e2:23:7e:07] Initialising via deferred consensus
INFO: 2018/07/28 21:19:01.953106 Sniffing traffic on datapath (via ODP)
INFO: 2018/07/28 21:19:01.970470 ->[192.168.100.1:6783] attempting connection
INFO: 2018/07/28 21:19:01.976769 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2018/07/28 21:19:01.976908 Listening for metrics requests on 0.0.0.0:6782
INFO: 2018/07/28 21:19:01.978754 ->[192.168.100.1:52662] connection accepted
INFO: 2018/07/28 21:19:01.979052 ->[192.168.100.1:52662|4e:64:e2:23:7e:07(master)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/07/28 21:19:01.979190 ->[192.168.100.1:6783|4e:64:e2:23:7e:07(master)]: connection shutting down due to error: cannot connect to ourself
INFO: 2018/07/28 21:19:02.606420 [kube-peers] Added myself to peer list &{[{4e:64:e2:23:7e:07 master}]}
DEBU: 2018/07/28 21:19:02.611265 [kube-peers] Nodes that have disappeared: map[]
10.32.0.1
$ kubectl logs -n kube-system weave-net-4mkc5 weave
FATA: 2018/07/28 22:18:53.116928 [kube-peers] Could not get peers: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout
Failed to get peers
$ kubectl get events
LAST SEEN   FIRST SEEN   COUNT     NAME                      KIND      SUBOBJECT   TYPE      REASON                    SOURCE              MESSAGE
59m         59m          1         master.1545a51451586285   Node                  Normal    NodeReady                 kubelet, master     Node master status is now: NodeReady
56m         56m          1         node1.1545a5358c592468    Node                  Normal    Starting                  kubelet, node1      Starting kubelet.
56m         56m          2         node1.1545a53593266483    Node                  Normal    NodeHasSufficientDisk     kubelet, node1      Node node1 status is now: NodeHasSufficientDisk
56m         56m          2         node1.1545a5359326a70c    Node                  Normal    NodeHasSufficientMemory   kubelet, node1      Node node1 status is now: NodeHasSufficientMemory
56m         56m          2         node1.1545a5359326b6e7    Node                  Normal    NodeHasNoDiskPressure     kubelet, node1      Node node1 status is now: NodeHasNoDiskPressure
56m         56m          2         node1.1545a5359326c1ab    Node                  Normal    NodeHasSufficientPID      kubelet, node1      Node node1 status is now: NodeHasSufficientPID
56m         56m          1         node1.1545a5359e23f3d6    Node                  Normal    NodeAllocatableEnforced   kubelet, node1      Updated Node Allocatable limit across pods
56m         56m          1         node2.1545a53c8fa2286f    Node                  Normal    NodeAllocatableEnforced   kubelet, node2      Updated Node Allocatable limit across pods
56m         56m          1         node2.1545a53c7ebb47f8    Node                  Normal    Starting                  kubelet, node2      Starting kubelet.
56m         56m          2         node2.1545a53c851c3285    Node                  Normal    NodeHasSufficientDisk     kubelet, node2      Node node2 status is now: NodeHasSufficientDisk
56m         56m          2         node2.1545a53c851c6b0c    Node                  Normal    NodeHasSufficientMemory   kubelet, node2      Node node2 status is now: NodeHasSufficientMemory
56m         56m          2         node2.1545a53c851c7870    Node                  Normal    NodeHasNoDiskPressure     kubelet, node2      Node node2 status is now: NodeHasNoDiskPressure
56m         56m          2         node2.1545a53c851c854a    Node                  Normal    NodeHasSufficientPID      kubelet, node2      Node node2 status is now: NodeHasSufficientPID
53m         53m          1         node1.1545a55da395befc    Node                  Normal    Starting                  kube-proxy, node1   Starting kube-proxy.
52m         52m          1         node2.1545a56a75b2f369    Node                  Normal    Starting                  kube-proxy, node2   Starting kube-proxy.

Network:

$ ip route
default via 10.0.3.2 dev enp0s8 proto dhcp metric 100 
10.0.3.0/24 dev enp0s8 proto kernel scope link src 10.0.3.15 metric 100 
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1 
169.254.0.0/16 dev enp0s3 scope link metric 1002 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.100.0/24 dev enp0s3 proto kernel scope link src 192.168.100.1 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 
$ iptables-save
# Generated by iptables-save v1.4.21 on Sun Jul 29 00:08:56 2018
*nat
:PREROUTING ACCEPT [3:180]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [1:60]
:POSTROUTING ACCEPT [1:60]
:DOCKER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-3DU66DE6VORVEQVD - [0:0]
:KUBE-SEP-EMMXYKITEUOEZBH3 - [0:0]
:KUBE-SEP-S4MK5EVI7CLHCCS6 - [0:0]
:KUBE-SEP-SZZ7MOWKTWUFXIJT - [0:0]
:KUBE-SEP-UJJNLSZU6HL4F5UO - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
:WEAVE - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -j WEAVE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-3DU66DE6VORVEQVD -s 10.32.0.3/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-3DU66DE6VORVEQVD -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.32.0.3:53
-A KUBE-SEP-EMMXYKITEUOEZBH3 -s 192.168.100.1/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-EMMXYKITEUOEZBH3 -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 192.168.100.1:6443
-A KUBE-SEP-S4MK5EVI7CLHCCS6 -s 10.32.0.3/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-S4MK5EVI7CLHCCS6 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.32.0.3:53
-A KUBE-SEP-SZZ7MOWKTWUFXIJT -s 10.32.0.2/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-SZZ7MOWKTWUFXIJT -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.32.0.2:53
-A KUBE-SEP-UJJNLSZU6HL4F5UO -s 10.32.0.2/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-UJJNLSZU6HL4F5UO -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.32.0.2:53
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-UJJNLSZU6HL4F5UO
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-S4MK5EVI7CLHCCS6
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-EMMXYKITEUOEZBH3
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-SZZ7MOWKTWUFXIJT
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-3DU66DE6VORVEQVD
-A WEAVE -s 10.32.0.0/12 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 10.32.0.0/12 -d 10.32.0.0/12 -j MASQUERADE
-A WEAVE -s 10.32.0.0/12 ! -d 10.32.0.0/12 -j MASQUERADE
COMMIT
# Completed on Sun Jul 29 00:08:56 2018
# Generated by iptables-save v1.4.21 on Sun Jul 29 00:08:56 2018
*filter
:INPUT ACCEPT [347:89351]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [358:95754]
:DOCKER - [0:0]
:DOCKER-ISOLATION - [0:0]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
:WEAVE-NPC - [0:0]
:WEAVE-NPC-DEFAULT - [0:0]
:WEAVE-NPC-EGRESS - [0:0]
:WEAVE-NPC-EGRESS-ACCEPT - [0:0]
:WEAVE-NPC-EGRESS-CUSTOM - [0:0]
:WEAVE-NPC-EGRESS-DEFAULT - [0:0]
:WEAVE-NPC-INGRESS - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A INPUT -i weave -j WEAVE-NPC-EGRESS
-A FORWARD -i weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC-EGRESS
-A FORWARD -o weave -m comment --comment "NOTE: this must go before \'-j KUBE-FORWARD\'" -j WEAVE-NPC
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A WEAVE-NPC -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC -d 224.0.0.0/4 -j ACCEPT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-DEFAULT
-A WEAVE-NPC -m state --state NEW -j WEAVE-NPC-INGRESS
-A WEAVE-NPC -m set ! --match-set weave-local-pods dst -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-Rzff}h:=]JaaJl/G;(XJpGjZ[ dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-public" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-;rGqyMIl1HN^cfDki~Z$3]6!N dst -m comment --comment "DefaultAllow ingress isolation for namespace: default" -j ACCEPT
-A WEAVE-NPC-DEFAULT -m set --match-set weave-P.B|!ZhkAr5q=XZ?3}tMBA+0 dst -m comment --comment "DefaultAllow ingress isolation for namespace: kube-system" -j ACCEPT
-A WEAVE-NPC-EGRESS -m state --state RELATED,ESTABLISHED -j ACCEPT
-A WEAVE-NPC-EGRESS -m state --state NEW -m set ! --match-set weave-local-pods src -j RETURN
-A WEAVE-NPC-EGRESS -d 224.0.0.0/4 -j RETURN
-A WEAVE-NPC-EGRESS -m state --state NEW -j WEAVE-NPC-EGRESS-DEFAULT
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j WEAVE-NPC-EGRESS-CUSTOM
-A WEAVE-NPC-EGRESS -m state --state NEW -m mark ! --mark 0x40000/0x40000 -j NFLOG --nflog-group 86
-A WEAVE-NPC-EGRESS -m mark ! --mark 0x40000/0x40000 -j DROP
-A WEAVE-NPC-EGRESS-ACCEPT -j MARK --set-xmark 0x40000/0x40000
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-41s)5vQ^o/xWGz6a20N:~?#|E src -m comment --comment "DefaultAllow egress isolation for namespace: kube-public" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-s_+ChJId4Uy_$}G;WdH|~TK)I src -m comment --comment "DefaultAllow egress isolation for namespace: default" -j RETURN
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j WEAVE-NPC-EGRESS-ACCEPT
-A WEAVE-NPC-EGRESS-DEFAULT -m set --match-set weave-E1ney4o[ojNrLk.6rOHi;7MPE src -m comment --comment "DefaultAllow egress isolation for namespace: kube-system" -j RETURN
COMMIT
murali-reddy commented 6 years ago

$ kubectl logs -n kube-system weave-net-4mkc5 weave
FATA: 2018/07/28 22:18:53.116928 [kube-peers] Could not get peers: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout
Failed to get peers

Weave pods are unable to reach the Kubernetes API server through the service proxy. Very likely you are running into routing issues, and nothing related to Weave as such. Please take a look at the routes below and see if this is what is causing the issue.

192.168.100.0/24 dev enp0s3 proto kernel scope link src 192.168.100.1

This routing table entry indicates that the nodes are reachable through enp0s3.

default via 10.0.3.2 dev enp0s8 proto dhcp metric 100

Since there is no explicit route for the service IP range, IPs in 10.96.0.0/12 get routed through enp0s8. My guess is that when Weave pods reach 10.96.0.1:443, the service proxy DNATs the destination IP to 192.168.100.1, which is then sent over enp0s3; this should result in packet drops on the master node.

Please see the documentation at https://kubernetes.io/docs/setup/independent/install-kubeadm/#check-network-adapters for this requirement, and issues with a similar context.

brb commented 6 years ago

I think the problem is that the DNAT translation in the Linux kernel happens after the src IP addr has been chosen for the packet. In your case, the master receives the request with the src IP set to the one that would be used towards the default gw (ip route get 10.96.0.1), and therefore the response is sent via enp0s8 instead of enp0s3 (or just dropped if the rp_filter policy is enabled) => the packet is lost.

To fix it, you can try adding a route on each worker node with ip route add 10.96.0.1/32 dev enp0s3 src $IP_ADDR_OF_enp0s3.
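Putting that suggestion together, a sketch of the check and the fix on a worker node might look like the following (interface name and address taken from this report — enp0s3 / 192.168.100.2 on node1 — adjust per node):

```shell
# Non-destructive check: show which interface and source address the
# kernel currently picks for the service VIP. Before the fix this will
# name the NAT interface (e.g. "dev enp0s8 src 10.0.4.15"):
ip route get 10.96.0.1

# Pin the service VIP to the host-only interface so replies to the
# DNAT'ed API-server traffic leave the same way the request came in:
sudo ip route add 10.96.0.1/32 dev enp0s3 src 192.168.100.2
```

Note that a route added this way is not reboot safe on its own.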

annismckenzie commented 6 years ago

In our own Vagrant-based devbox setup (Saltstack-based, but that's just FYI) we've worked around a couple of quirks (which includes this issue) by:

  1. adding --pod-network-cidr 10.32.0.0/12 to the kubeadm init command and using https://cloud.weave.works/k8s/net?k8s-version=v1.11.1&env.IPALLOC_RANGE=10.32.0.0/12 as the URL for fetching the weave.yaml, or directly applying it like you did. What you pass doesn't really matter as long as it's the same range in both cases.

That'll fix the issue that the API server isn't reachable because the kube-proxy cannot distinguish between the pod IPs and the service IPs.
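For reference, step 1 spelled out as commands might look like this (a sketch assembled from this thread: the apiserver address is from the original report, and the range value assumes Weave's default 10.32.0.0/12 is kept):

```shell
# Initialise the control plane with a cluster CIDR that matches
# Weave's allocation range:
kubeadm init --apiserver-advertise-address=192.168.100.1 \
  --pod-network-cidr 10.32.0.0/12

# Apply the Weave manifest with the same range pinned explicitly:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=v1.11.1&env.IPALLOC_RANGE=10.32.0.0/12"
```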

The next issue you'll run into with a Vagrant setup is probably that your nodes all report 10.0.2.15 as their internal IP to the API server (do a kubectl describe node and look at the output). That can be fixed by choosing predictable worker IPs (for example starting with 192.168.100.10) and adding them as an extra argument to the kubelet's systemd configuration. For us, Saltstack creates /etc/systemd/system/kubelet.service.d/20-kubelet-node-ip.conf and sets

[Service]
Environment="KUBELET_EXTRA_ARGS=--node-ip=192.168.100.10"

as the content. A reload of the unit is required, though the details depend on what creates the configuration and when.
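Done by hand rather than via Saltstack, creating that drop-in and reloading the unit could look roughly like this (the node IP is the example value from above; use each node's own host-only address):

```shell
# Create the kubelet drop-in directory and the drop-in file pinning
# the node's reported IP to its host-only address:
sudo mkdir -p /etc/systemd/system/kubelet.service.d
sudo tee /etc/systemd/system/kubelet.service.d/20-kubelet-node-ip.conf >/dev/null <<'EOF'
[Service]
Environment="KUBELET_EXTRA_ARGS=--node-ip=192.168.100.10"
EOF

# Pick up the new unit configuration and restart the kubelet:
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```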

It took us a lot of time to debug these issues but we finally have a stable and reproducible multi node Vagrant cluster.

A final note about debugging the Weave pods. We saved the weave.yaml file instead of directly applying it and replaced the livenessProbe with a readinessProbe in the weave-net DaemonSet to stop K8s from constantly reaping the Weave pods. It doesn't change the outcome (node wasn't ready before and it's not after) but it makes debugging a lot easier.

murali-reddy commented 6 years ago

That'll fix the issue that the API server isn't reachable because the kube-proxy cannot distinguish between the pod IPs and the service IPs.

Specifying pod-network-cidr, which results in clusterCIDR being set for kube-proxy, may have accidentally fixed the problem. Basically it's masking the problem by MASQUERADEing the traffic.

Weave Net runs in hostNetwork, so it is not going to get an IP from the pod CIDR anyway. In the case of multiple interfaces it's still important to establish proper routes so that traffic to service VIPs gets routed properly.

annismckenzie commented 6 years ago

You're right that this might be an accidental fix, but for now it's easier to do than making sure ip route add 10.96.0.1/32 … is done on all nodes and is reboot safe, regardless of the OS used for the cluster. As this is a Vagrant setup used for development, and it does work, I'm fine with it for now. Do you have a less brittle proposal?

bboreham commented 6 years ago

Maybe change your Vagrant setup so that the default route is also the route to the api-server?

There are some different suggestions in the lengthy thread at https://github.com/kubernetes/kubeadm/issues/102

I'm not aware of anyone attacking the fundamental issue that Linux doesn't reconsider its choice of source address after a DNAT.

hhrutter commented 6 years ago

Thanks for all the interesting input.

Although I am still trying to understand why I need this in my particular setup, adding a static route on my worker nodes with ip route add 10.96.0.1/32 dev enp0s3 fixed the problem, and all nodes are up and running, including the weave-net pods.

annismckenzie commented 6 years ago

How’d you make that reboot safe? The route will only be there temporarily.

On Aug 5, 2018, at 5:58 PM, Horst Rutter notifications@github.com wrote:

Closed #3363.


hhrutter commented 6 years ago

I persisted the static route on the worker nodes via a network script, like so:

/etc/sysconfig/network-scripts/route-enp0s3:
10.96.0.1/32 via 192.168.100.1 dev enp0s3
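As a sketch of that persistence step (writing to a local staging file here for illustration; on a real worker the target path is /etc/sysconfig/network-scripts/route-enp0s3):

```shell
# CentOS 7 network scripts re-add any routes listed in route-<interface>
# whenever that interface comes up, which makes the fix reboot safe.
ROUTE_FILE=route-enp0s3   # staged locally; real path: /etc/sysconfig/network-scripts/route-enp0s3
printf '10.96.0.1/32 via 192.168.100.1 dev enp0s3\n' > "$ROUTE_FILE"
cat "$ROUTE_FILE"
```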

schollii commented 3 years ago

In case this helps others: I had the exact same setup as described in the issue, but adding the route did not help. I fixed the problem by recreating the cluster, but this time I specified the pod CIDR block in the kubeadm init with --pod-network-cidr (I chose 10.244.0.0/16, but you should be able to pick anything that doesn't overlap the other interfaces), and it worked. Specifying the pod CIDR is sufficient. cc/ @annismckenzie