projectcalico / canal

Policy based networking for cloud native applications

vxlan_network.go:158] failed to add vxlanRoute (10.244.0.0/24 -> 10.244.0.0): invalid argument #122

Closed slecrenski closed 6 years ago

slecrenski commented 6 years ago

Docker: 1.12.6
RHEL: 7.3
Linux k8s-master 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes 1.9.3
quay.io/calico/node:v2.6.2
quay.io/calico/cni:v1.11.0
quay.io/coreos/flannel:v0.9.1

Azure Cloud with vnet address space: 10.244.0.0/16

net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }

Cluster was initialized with

kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=cri
# Applied canal networking, which runs flannel.
kubectl scale deployment kube-dns -n kube-system --replicas=2
# This attempted to launch kube-dns on the agent node.
# The kube-dns container on the agent node never runs because it cannot communicate with the master node to determine the DNS configuration.
[root@k8s-master v2]# kubectl logs kube-dns-6f4fd4bdf-dsjtq -n kube-system -c kubedns
I0307 21:34:17.804073       1 dns.go:48] version: 1.14.6-3-gc36cb11
I0307 21:34:17.805197       1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0307 21:34:17.805254       1 server.go:112] FLAG: --alsologtostderr="false"
I0307 21:34:17.805264       1 server.go:112] FLAG: --config-dir="/kube-dns-config"
I0307 21:34:17.805271       1 server.go:112] FLAG: --config-map=""
I0307 21:34:17.805277       1 server.go:112] FLAG: --config-map-namespace="kube-system"
I0307 21:34:17.805283       1 server.go:112] FLAG: --config-period="10s"
I0307 21:34:17.805290       1 server.go:112] FLAG: --dns-bind-address="0.0.0.0"
I0307 21:34:17.805296       1 server.go:112] FLAG: --dns-port="10053"
I0307 21:34:17.805303       1 server.go:112] FLAG: --domain="cluster.local."
I0307 21:34:17.805311       1 server.go:112] FLAG: --federations=""
I0307 21:34:17.805318       1 server.go:112] FLAG: --healthz-port="8081"
I0307 21:34:17.805324       1 server.go:112] FLAG: --initial-sync-timeout="1m0s"
I0307 21:34:17.805330       1 server.go:112] FLAG: --kube-master-url=""
I0307 21:34:17.805336       1 server.go:112] FLAG: --kubecfg-file=""
I0307 21:34:17.805342       1 server.go:112] FLAG: --log-backtrace-at=":0"
I0307 21:34:17.805350       1 server.go:112] FLAG: --log-dir=""
I0307 21:34:17.805356       1 server.go:112] FLAG: --log-flush-frequency="5s"
I0307 21:34:17.805362       1 server.go:112] FLAG: --logtostderr="true"
I0307 21:34:17.805368       1 server.go:112] FLAG: --nameservers=""
I0307 21:34:17.805374       1 server.go:112] FLAG: --stderrthreshold="2"
I0307 21:34:17.805390       1 server.go:112] FLAG: --v="2"
I0307 21:34:17.805396       1 server.go:112] FLAG: --version="false"
I0307 21:34:17.805415       1 server.go:112] FLAG: --vmodule=""
I0307 21:34:17.805466       1 server.go:194] Starting SkyDNS server (0.0.0.0:10053)
I0307 21:34:17.805656       1 server.go:213] Skydns metrics enabled (/metrics:10055)
I0307 21:34:17.805677       1 dns.go:146] Starting endpointsController
I0307 21:34:17.805683       1 dns.go:149] Starting serviceController
I0307 21:34:17.805805       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0307 21:34:17.805826       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0307 21:34:18.306107       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:18.806137       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:19.305925       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:19.805901       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:20.305909       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:20.805936       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:21.305954       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:21.805893       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:22.305926       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:22.806025       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:23.305962       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:23.805877       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:24.305931       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:24.805906       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:25.305905       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:25.806023       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:26.305906       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:26.806023       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:27.305930       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:27.805968       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:28.305904       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:28.805886       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:29.305877       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:29.805878       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:30.305896       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:30.805966       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:31.305877       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:31.805931       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:32.305950       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:32.805986       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:33.305935       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:33.805899       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:34.305962       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:34.806082       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:35.305918       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:35.805870       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:36.305998       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:36.805920       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:37.305936       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:37.805872       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:38.306083       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:38.806080       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:39.305912       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:39.805896       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:40.306024       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:40.805991       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:41.305912       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:41.805891       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:42.305905       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:42.805873       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:43.305893       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:43.805927       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:44.305913       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:44.806054       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:45.306072       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:45.805924       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:46.305902       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:46.805889       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:47.305910       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
I0307 21:34:47.806071       1 dns.go:173] Waiting for services and endpoints to be initialized from apiserver...
E0307 21:34:47.806907       1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:150: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E0307 21:34:47.807363       1 reflector.go:201] k8s.io/dns/pkg/dns/dns.go:147: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

The cluster is running in Azure, and the virtual network address space (10.244.0.0/16) is the same range as the pod CIDR.
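The overlap described here can be checked mechanically. Below is a minimal sketch using Python's standard `ipaddress` module. The address ranges and node IPs are copied from this report; the variable names are illustrative only.

```python
import ipaddress

# Values copied from the report above.
vnet = ipaddress.ip_network("10.244.0.0/16")         # Azure VNet address space
pod_network = ipaddress.ip_network("10.244.0.0/16")  # flannel "Network" / --pod-network-cidr
node_ips = {
    "k8s-master": ipaddress.ip_address("10.244.0.100"),
    "k8s-agent1": ipaddress.ip_address("10.244.0.4"),
}

# True when the two ranges share at least one address.
ranges_overlap = vnet.overlaps(pod_network)
print("VNet and pod network overlap:", ranges_overlap)

# Node addresses inside the pod network are ambiguous:
# the same address could belong to a VM or to a pod.
nodes_in_pod_network = sorted(
    name for name, ip in node_ips.items() if ip in pod_network
)
print("Node IPs inside the pod network:", nodes_in_pod_network)
```

With the values in this report, both nodes sit inside the range that flannel also hands out to pods.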

[root@k8s-master v2]# kubectl get pods -n kube-system -o wide
NAME                                 READY     STATUS             RESTARTS   AGE       IP             NODE
canal-9sfh5                          3/3       Running            0          1h        10.244.0.4     k8s-agent1
canal-jmgzn                          3/3       Running            0          1h        10.244.0.100   k8s-master
etcd-k8s-master                      1/1       Running            0          2h        10.244.0.100   k8s-master
kube-apiserver-k8s-master            1/1       Running            0          2h        10.244.0.100   k8s-master
kube-controller-manager-k8s-master   1/1       Running            0          2h        10.244.0.100   k8s-master
kube-dns-6f4fd4bdf-ch98b             3/3       Running            0          2h        10.244.0.24    k8s-master
kube-dns-6f4fd4bdf-dsjtq             1/3       CrashLoopBackOff   48         1h        10.244.3.2     k8s-agent1
kube-proxy-x6j8p                     1/1       Running            0          2h        10.244.0.100   k8s-master
kube-proxy-z5bbd                     1/1       Running            0          1h        10.244.0.4     k8s-agent1
kube-scheduler-k8s-master            1/1       Running            0          2h        10.244.0.100   k8s-master
[root@k8s-master v2]# kubectl describe node k8s-master
Name:               k8s-master
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_DS2_v2
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=usgovvirginia
                    failure-domain.beta.kubernetes.io/zone=1
                    kubernetes.io/hostname=k8s-master
                    node-role.kubernetes.io/master=
Annotations:        flannel.alpha.coreos.com/backend-data={"VtepMAC":"92:b2:1f:03:ff:99"}
                    flannel.alpha.coreos.com/backend-type=vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager=true
                    flannel.alpha.coreos.com/public-ip=10.244.0.100
                    node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:  Wed, 07 Mar 2018 19:25:44 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Wed, 07 Mar 2018 21:31:45 +0000   Wed, 07 Mar 2018 19:25:39 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Wed, 07 Mar 2018 21:31:45 +0000   Wed, 07 Mar 2018 19:25:39 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 07 Mar 2018 21:31:45 +0000   Wed, 07 Mar 2018 19:25:39 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Wed, 07 Mar 2018 21:31:45 +0000   Wed, 07 Mar 2018 19:26:55 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.244.0.100
  Hostname:    k8s-master
Capacity:
 cpu:     2
 memory:  7125792Ki
 pods:    110
Allocatable:
 cpu:     2
 memory:  7023392Ki
 pods:    110
System Info:
 Machine ID:                 aa4f0681ccb6435784669b356fa73d9c
 System UUID:                2E21AA4F-77BB-F640-990D-12267E1262C0
 Boot ID:                    12d0a064-2b59-411d-8d64-d9c2a61472f0
 Kernel Version:             3.10.0-693.21.1.el7.x86_64
 OS Image:                   Red Hat Enterprise Linux Server 7.4 (Maipo)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.12.6
 Kubelet Version:            v1.9.3
 Kube-Proxy Version:         v1.9.3
PodCIDR:                     10.244.0.0/24
ExternalID:                  /subscriptions/28865b6d-f25c-4bba-a4f1-a16bfa782571/resourceGroups/kubernetes/providers/Microsoft.Compute/virtualMachines/k8s-master
Non-terminated Pods:         (7 in total)
  Namespace                  Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                                  ------------  ----------  ---------------  -------------
  kube-system                canal-jmgzn                           250m (12%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                etcd-k8s-master                       0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-apiserver-k8s-master             250m (12%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-controller-manager-k8s-master    200m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-dns-6f4fd4bdf-ch98b              260m (13%)    0 (0%)      110Mi (1%)       170Mi (2%)
  kube-system                kube-proxy-x6j8p                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-scheduler-k8s-master             100m (5%)     0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  1060m (53%)   0 (0%)      110Mi (1%)       170Mi (2%)
Events:         <none>
[root@k8s-master v2]# kubectl describe node k8s-agent1
Name:               k8s-agent1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_DS2_v2
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=usgovvirginia
                    failure-domain.beta.kubernetes.io/zone=0
                    kubernetes.io/hostname=k8s-agent1
Annotations:        flannel.alpha.coreos.com/backend-data={"VtepMAC":"8a:50:80:d4:48:ec"}
                    flannel.alpha.coreos.com/backend-type=vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager=true
                    flannel.alpha.coreos.com/public-ip=10.244.0.4
                    node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Wed, 07 Mar 2018 19:37:25 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  OutOfDisk        False   Wed, 07 Mar 2018 21:30:51 +0000   Wed, 07 Mar 2018 19:37:25 +0000   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure   False   Wed, 07 Mar 2018 21:30:51 +0000   Wed, 07 Mar 2018 19:37:25 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 07 Mar 2018 21:30:51 +0000   Wed, 07 Mar 2018 19:37:25 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            True    Wed, 07 Mar 2018 21:30:51 +0000   Wed, 07 Mar 2018 19:40:54 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.244.0.4
  Hostname:    k8s-agent1
Capacity:
 cpu:     2
 memory:  7125792Ki
 pods:    110
Allocatable:
 cpu:     2
 memory:  7023392Ki
 pods:    110
System Info:
 Machine ID:                 aa4f0681ccb6435784669b356fa73d9c
 System UUID:                BEF56729-F758-5345-BBA8-536DF72C8981
 Boot ID:                    f0e31820-375e-40a0-81d4-f45720bc8222
 Kernel Version:             3.10.0-693.21.1.el7.x86_64
 OS Image:                   Red Hat Enterprise Linux Server 7.4 (Maipo)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.12.6
 Kubelet Version:            v1.9.3
 Kube-Proxy Version:         v1.9.3
PodCIDR:                     10.244.3.0/24
ExternalID:                  /subscriptions/28865b6d-f25c-4bba-a4f1-a16bfa782571/resourceGroups/kubernetes/providers/Microsoft.Compute/virtualMachines/k8s-agent1
Non-terminated Pods:         (3 in total)
  Namespace                  Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----                        ------------  ----------  ---------------  -------------
  kube-system                canal-9sfh5                 250m (12%)    0 (0%)      0 (0%)           0 (0%)
  kube-system                kube-dns-6f4fd4bdf-dsjtq    260m (13%)    0 (0%)      110Mi (1%)       170Mi (2%)
  kube-system                kube-proxy-z5bbd            0 (0%)        0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  510m (25%)    0 (0%)      110Mi (1%)       170Mi (2%)
Events:         <none>

I have a very basic configuration: one master node and one agent node. DNS queries are not working on the agent node. kube-dns is running on the master node. The master node IP is 10.244.0.100 and the agent node IP is 10.244.0.4.

I am trying to figure out why I cannot reach 10.96.0.10 (kube-dns) from the agent node. That traffic is supposed to be routed to the master node, where kube-dns is running.

I've been looking at log files and enabling level 10 verbosity for the past several hours. What does this error message mean?

vxlan_network.go:158] failed to add vxlanRoute (10.244.0.0/24 -> 10.244.0.0): invalid argument
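
One possible reading of this error is offered here as an assumption, not as something confirmed in this thread. The route flannel tries to add uses 10.244.0.0 as its next hop, and on these nodes 10.244.0.0 is also the network address of the directly connected eth0 subnet (10.244.0.0/16). The kernel may refuse such an address as a gateway. The sketch below only checks that coincidence with Python's `ipaddress` module; it does not reproduce the kernel's logic.

```python
import ipaddress

# From the error message: failed to add vxlanRoute (10.244.0.0/24 -> 10.244.0.0)
route_dst = ipaddress.ip_network("10.244.0.0/24")
route_gw = ipaddress.ip_address("10.244.0.0")

# From `ip route` on the nodes: 10.244.0.0/16 dev eth0 proto kernel scope link
connected = ipaddress.ip_network("10.244.0.0/16")

# Is the requested gateway a special address of the connected subnet?
gw_is_network_address = route_gw == connected.network_address
gw_is_broadcast_address = route_gw == connected.broadcast_address

# Does the overlay route's destination fall inside the connected subnet?
dst_inside_connected = route_dst.subnet_of(connected)

print("gateway is the connected subnet's network address:", gw_is_network_address)
print("gateway is the connected subnet's broadcast address:", gw_is_broadcast_address)
print("route destination lies inside the connected subnet:", dst_inside_connected)
```

If this reading is right, a pod network that does not overlap the VNet range would avoid the clash, but that is a hypothesis to verify, not a confirmed fix.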

I am unable to get pods that require kube-dns to run. They fail with a DNS error when they try to look up kubernetes.default.svc.cluster.local. If I scale kube-dns so that a replica launches on the non-master node, that replica fails to start because of the same DNS lookup problem.

I am unable to get pod-to-kube-dns and node-to-kube-dns communication working. How can I debug what the issue is?

These are RHEL 7.3 nodes with:

swapoff -a
setenforce 0
IP forwarding enabled
IP forwarding is also enabled at the NIC level in Azure for both virtual machines.
systemctl stop firewalld
systemctl disable firewalld

kubectl get nodes -o yaml |grep flannel.alpha
flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"4e:6d:96:f9:df:0b"}'
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: "true"
flannel.alpha.coreos.com/public-ip: 10.244.0.4
flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"22:88:e0:e0:ee:9c"}'
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: "true"
flannel.alpha.coreos.com/public-ip: 10.244.0.100

Master Node:

nslookup kubernetes.default.svc.cluster.local 10.96.0.10
Server: 10.96.0.10
Address:    10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

Slave Node:

nslookup kubernetes.default.svc.cluster.local 10.96.0.10
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

Interestingly enough, I can do things like this:

[root@k8s-agent1 ~]# wget https://10.96.0.1:443/api/v1/nodes?resourceVersion=0
--2018-03-07 21:22:34--  https://10.96.0.1/api/v1/nodes?resourceVersion=0
Connecting to 10.96.0.1:443... connected.
ERROR: cannot verify 10.96.0.1's certificate, issued by ‘/CN=kubernetes’:
  Unable to locally verify the issuer's authority.
To connect to 10.96.0.1 insecurely, use `--no-check-certificate'.

Slave Node Iptables:

iptables-save | grep kube-dns
-A KUBE-SEP-LGXZUSYJZFXP55VS -s 10.244.0.20/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-LGXZUSYJZFXP55VS -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.20:53
-A KUBE-SEP-WHU5MQLF6I7CQ4PO -s 10.244.0.20/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-WHU5MQLF6I7CQ4PO -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.20:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-LGXZUSYJZFXP55VS
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-WHU5MQLF6I7CQ4PO
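
These rules show the service IP 10.96.0.10 being rewritten (DNAT) to a pod endpoint. Below is a small sketch that pulls the endpoint out of such rules, so it can be compared against the node routing tables. The two rules are copied from the output above; the regular expression and variable names are illustrative and only cover rules of this shape.

```python
import re

# Two DNAT rules copied from the iptables-save output above.
rules = """\
-A KUBE-SEP-LGXZUSYJZFXP55VS -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.20:53
-A KUBE-SEP-WHU5MQLF6I7CQ4PO -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.20:53
"""

# Capture protocol, service comment, and DNAT target from each rule.
pattern = re.compile(
    r'-p (\w+) .*--comment "([^"]+)".*-j DNAT --to-destination (\S+)'
)

endpoints = []
for line in rules.splitlines():
    match = pattern.search(line)
    if match:
        proto, service, target = match.groups()
        endpoints.append((service, proto, target))

for service, proto, target in endpoints:
    print(f"{service} ({proto}) -> {target}")
```

Both the TCP and UDP rules point at the same pod address, 10.244.0.20, so reaching kube-dns from the agent depends on how the agent routes that address.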

Is there any way to use tcpdump to figure out this issue?

What does this error mean? vxlan_network.go:158] failed to add vxlanRoute (10.244.0.0/24 -> 10.244.0.0): invalid argument

Master is running at 10.244.0.100 and agent node is running at 10.244.0.4.

--master

ip route
default via 10.244.0.1 dev eth0 proto static metric 100
10.244.0.0/16 dev eth0 proto kernel scope link src 10.244.0.100 metric 100
10.244.0.20 dev cali48fa6642c60 scope link
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
168.63.129.16 via 10.244.0.1 dev eth0 proto dhcp metric 100
169.254.169.254 via 10.244.0.1 dev eth0 proto dhcp metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

--agent

ip route
default via 10.244.0.1 dev eth0 proto static metric 100
10.244.0.0/16 dev eth0 proto kernel scope link src 10.244.0.4 metric 100
10.244.1.20 dev cali66d03ab9413 scope link
168.63.129.16 via 10.244.0.1 dev eth0 proto dhcp metric 100
169.254.169.254 via 10.244.0.1 dev eth0 proto dhcp metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
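
The agent's table above has no `flannel.1` entry. To see what that means for the kube-dns endpoint 10.244.0.20, here is a simplified longest-prefix lookup over the agent routes. It is a sketch only: it ignores metrics, policy routing, and anything else the kernel considers, and the `lookup` helper is hypothetical.

```python
import ipaddress

# Agent routes copied from the output above, as (destination, device).
agent_routes = [
    ("0.0.0.0/0", "eth0"),            # default via 10.244.0.1
    ("10.244.0.0/16", "eth0"),
    ("10.244.1.20/32", "cali66d03ab9413"),
    ("168.63.129.16/32", "eth0"),
    ("169.254.169.254/32", "eth0"),
    ("172.17.0.0/16", "docker0"),
]

def lookup(dest, routes):
    """Return the (network, device) pair with the longest prefix containing dest."""
    addr = ipaddress.ip_address(dest)
    candidates = [
        (ipaddress.ip_network(net), dev)
        for net, dev in routes
        if addr in ipaddress.ip_network(net)
    ]
    return max(candidates, key=lambda item: item[0].prefixlen)

# 10.244.0.20 is the kube-dns endpoint from the iptables rules.
net, dev = lookup("10.244.0.20", agent_routes)
print(f"10.244.0.20 matches {net} via {dev}")
```

Under this model the endpoint matches the connected 10.244.0.0/16 route on eth0 rather than the overlay, which would fit the failed vxlanRoute in the flannel log.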

kubectl logs canal-6dw5b -n kube-system -c kube-flannel
I0307 17:17:25.701615 1 main.go:474] Determining IP address of default interface
I0307 17:17:25.702468 1 main.go:487] Using interface with name eth0 and address 10.244.0.4
I0307 17:17:25.702485 1 main.go:504] Defaulting external address to interface address (10.244.0.4)
I0307 17:17:25.716693 1 kube.go:130] Waiting 10m0s for node controller to sync
I0307 17:17:25.716731 1 kube.go:283] Starting kube subnet manager
I0307 17:17:26.716969 1 kube.go:137] Node controller sync successful
I0307 17:17:26.716990 1 main.go:234] Created subnet manager: Kubernetes Subnet Manager - k8s-agent1
I0307 17:17:26.716997 1 main.go:237] Installing signal handlers
I0307 17:17:26.717072 1 main.go:352] Found network config - Backend type: vxlan
I0307 17:17:26.717123 1 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
I0307 17:17:26.741083 1 main.go:299] Wrote subnet file to /run/flannel/subnet.env
I0307 17:17:26.741102 1 main.go:303] Running backend.
I0307 17:17:26.741112 1 main.go:321] Waiting for all goroutines to exit
I0307 17:17:26.741126 1 vxlan_network.go:56] watching for new subnet leases
E0307 17:17:26.742469 1 vxlan_network.go:158] failed to add vxlanRoute (10.244.0.0/24 -> 10.244.0.0): invalid argument
I0307 17:17:26.753958 1 iptables.go:114] Some iptables rules are missing; deleting and recreating rules
I0307 17:17:26.753975 1 iptables.go:136] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0307 17:17:26.754326 1 iptables.go:114] Some iptables rules are missing; deleting and recreating rules
I0307 17:17:26.754340 1 iptables.go:136] Deleting iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0307 17:17:26.755962 1 iptables.go:136] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
I0307 17:17:26.756540 1 iptables.go:136] Deleting iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0307 17:17:26.758839 1 iptables.go:124] Adding iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0307 17:17:26.759144 1 iptables.go:136] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
I0307 17:17:26.762439 1 iptables.go:136] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE
I0307 17:17:26.762971 1 iptables.go:124] Adding iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0307 17:17:26.765618 1 iptables.go:124] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0307 17:17:26.771098 1 iptables.go:124] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
I0307 17:17:26.774826 1 iptables.go:124] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
I0307 17:17:26.778155 1 iptables.go:124] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE

slecrenski commented 6 years ago

slave:

tcpdump -n udp dst portrange 8472
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
19:59:45.951924 IP 10.244.0.100.33274 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3518478 ecr 1223820,nop,wscale 7], length 0
19:59:45.952210 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3518479 ecr 1223820,nop,wscale 7], length 0
19:59:46.953154 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3519480 ecr 1223820,nop,wscale 7], length 0
19:59:46.954017 IP 10.244.0.100.33274 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3519480 ecr 1223820,nop,wscale 7], length 0
19:59:46.954049 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3519480 ecr 1223820,nop,wscale 7], length 0
19:59:48.154188 IP 10.244.0.100.33274 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3520681 ecr 1223820,nop,wscale 7], length 0
19:59:48.954142 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3521481 ecr 1223820,nop,wscale 7], length 0
19:59:48.955906 IP 10.244.0.100.33274 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3521482 ecr 1223820,nop,wscale 7], length 0
19:59:48.955933 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3521482 ecr 1223820,nop,wscale 7], length 0
19:59:51.154189 IP 10.244.0.100.33274 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3523681 ecr 1223820,nop,wscale 7], length 0
19:59:52.963897 IP 10.244.0.100.58754 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3525490 ecr 1223820,nop,wscale 7], length 0
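
In this capture the same inner SYN-ACK (`Flags [S.]`) shows up repeatedly with an identical sequence number, which suggests the handshake with 10.244.3.2 is not completing. Below is a rough sketch that groups inner-packet lines by source, destination, flags, and sequence number to make such repeats visible. The four lines are copied from the capture above; the regular expression is an assumption about this output format.

```python
import re
from collections import Counter

# Four inner-packet lines copied from the capture above.
capture = """\
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3518478 ecr 1223820,nop,wscale 7], length 0
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3518479 ecr 1223820,nop,wscale 7], length 0
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36462: Flags [S.], seq 696635755, ack 2548921951, win 27960, options [mss 1410,sackOK,TS val 3519480 ecr 1223820,nop,wscale 7], length 0
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36460: Flags [S.], seq 3980964095, ack 1743480415, win 27960, options [mss 1410,sackOK,TS val 3519480 ecr 1223820,nop,wscale 7], length 0
"""

# Source, destination, TCP flags, and sequence number of the inner packet.
inner = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]+)\], seq (\d+)")

seen = Counter()
for line in capture.splitlines():
    match = inner.match(line)
    if match:
        src, dst, flags, seq = match.groups()
        seen[(src, dst, flags, int(seq))] += 1

# Packets seen more than once with the same sequence number are retransmissions.
repeated = {key: count for key, count in seen.items() if count > 1}
for (src, dst, flags, seq), count in repeated.items():
    print(f"{src} -> {dst} flags [{flags}] seq {seq}: seen {count} times")
```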

master:

20:00:17.955855 IP 10.244.0.100.57275 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36542: Flags [S.], seq 2016186502, ack 98983334, win 27960, options [mss 1410,sackOK,TS val 3550482 ecr 1254821,nop,wscale 7], length 0
20:00:19.155120 IP 10.244.0.100.46515 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36540: Flags [S.], seq 3330915834, ack 368111801, win 27960, options [mss 1410,sackOK,TS val 3551682 ecr 1254820,nop,wscale 7], length 0
20:00:19.955223 IP 10.244.0.100.57275 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36542: Flags [S.], seq 2016186502, ack 98983334, win 27960, options [mss 1410,sackOK,TS val 3552482 ecr 1254821,nop,wscale 7], length 0
20:00:19.955792 IP 10.244.0.100.46515 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36540: Flags [S.], seq 3330915834, ack 368111801, win 27960, options [mss 1410,sackOK,TS val 3552482 ecr 1254820,nop,wscale 7], length 0
20:00:19.959782 IP 10.244.0.100.57275 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36542: Flags [S.], seq 2016186502, ack 98983334, win 27960, options [mss 1410,sackOK,TS val 3552486 ecr 1254821,nop,wscale 7], length 0
20:00:22.155192 IP 10.244.0.100.46515 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36540: Flags [S.], seq 3330915834, ack 368111801, win 27960, options [mss 1410,sackOK,TS val 3554682 ecr 1254820,nop,wscale 7], length 0
20:00:23.964762 IP 10.244.0.100.46515 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36540: Flags [S.], seq 3330915834, ack 368111801, win 27960, options [mss 1410,sackOK,TS val 3556491 ecr 1254820,nop,wscale 7], length 0
20:00:23.971974 IP 10.244.0.100.57275 > 10.244.0.4.otv: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.0.100.sun-sr-https > 10.244.3.2.36542: Flags [S.], seq 2016186502, ack 98983334, win 27960, options [mss 1410,sackOK,TS val 3556498 ecr 1254821,nop,wscale 7], length 0

slecrenski commented 6 years ago

Figured it out.

boogiefromzk commented 3 years ago

Figured it out.

What helped?