flannel-io / flannel

flannel is a network fabric for containers, designed for Kubernetes
Apache License 2.0

flanneld can not start [VXLAN init: operation not supported] #441

Closed xidui closed 8 years ago

xidui commented 8 years ago

This is my console output:

[root@kubernetes-slave-2 ~]# /usr/bin/flanneld -etcd-endpoints=http://node-1-master:2379 -etcd-prefix=/cluster.local/network
I0524 09:44:25.025576 08893 main.go:275] Installing signal handlers
I0524 09:44:25.025686 08893 main.go:130] Determining IP address of default interface
I0524 09:44:25.026547 08893 main.go:188] Using <my_ip> as external interface
I0524 09:44:25.026600 08893 main.go:189] Using <my_ip> as external endpoint
E0524 09:44:25.040006 08893 vxlan.go:104] VXLAN init: operation not supported
I0524 09:44:25.040060 08893 vxlan.go:105] Retrying in 1 second...
E0524 09:44:26.041475 08893 vxlan.go:104] VXLAN init: operation not supported
I0524 09:44:26.041528 08893 vxlan.go:105] Retrying in 1 second...
E0524 09:44:27.042943 08893 vxlan.go:104] VXLAN init: operation not supported
I0524 09:44:27.043009 08893 vxlan.go:105] Retrying in 1 second...
E0524 09:44:28.044456 08893 vxlan.go:104] VXLAN init: operation not supported
I0524 09:44:28.044509 08893 vxlan.go:105] Retrying in 1 second...

Does anyone have an idea what is causing this?

This is my version:

[root@kubernetes-slave-2 ~]# flanneld -version
0.5.3

tomdee commented 8 years ago

What operating system are you running? You need a 3.14 or newer kernel for VXLAN.

xidui commented 8 years ago

@tomdee it is CentOS 7

xidui commented 8 years ago

@JoshuaAndrew

I tried CentOS 7 on DigitalOcean, where the deployment succeeded. But when I tried CentOS 7 on Linode, it failed with the output I described above.

So it does not seem to be an operating system issue.

xidui commented 8 years ago

I think the kernel version is high enough:

[root@kubernetes-slave-2 ~]# uname -r
4.5.3-x86_64-linode67

xidui commented 8 years ago

@tomdee I found that the slave where flanneld started successfully has a kernel version below 3.14, yet with kernel 4.5.3 it fails. Strange.

[root@kubernetes-slave ~]# uname -r
3.10.0-327.10.1.el7.x86_64
[root@kubernetes-slave ~]# systemctl status flanneld.service
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2016-05-24 05:34:53 EDT; 21h ago
  Process: 14536 ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)
 Main PID: 14518 (flanneld)
   Memory: 4.4M
   CGroup: /system.slice/flanneld.service
           └─14518 /usr/bin/flanneld -etcd-endpoints=http://node-1-master:2379 -etcd-prefix=/cluster.local/network

May 25 03:02:02 kubernetes-slave flanneld[14518]: I0525 03:02:02.594997 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.0
May 25 03:02:03 kubernetes-slave flanneld[14518]: I0525 03:02:03.262913 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.0
May 25 03:02:04 kubernetes-slave flanneld[14518]: I0525 03:02:04.267140 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.0
May 25 03:02:05 kubernetes-slave flanneld[14518]: I0525 03:02:05.267023 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.0
May 25 03:02:10 kubernetes-slave flanneld[14518]: I0525 03:02:10.903122 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.3
May 25 03:02:20 kubernetes-slave flanneld[14518]: I0525 03:02:20.309652 14518 vxlan.go:345] L3 miss: 172.16.92.3
May 25 03:02:20 kubernetes-slave flanneld[14518]: I0525 03:02:20.309929 14518 device.go:187] calling NeighSet: 172.16.92.3, 7e:b0:d1:ef:54:38
May 25 03:02:20 kubernetes-slave flanneld[14518]: I0525 03:02:20.310253 14518 vxlan.go:356] AddL3 succeeded
May 25 03:02:38 kubernetes-slave flanneld[14518]: I0525 03:02:38.423138 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.0
May 25 03:02:47 kubernetes-slave flanneld[14518]: I0525 03:02:47.639118 14518 vxlan.go:340] Ignoring not a miss: 7e:b0:d1:ef:54:38, 172.16.92.3

xidui commented 8 years ago

I found that on my machine the following command fails:

[root@kubernetes-slave-2 ~]# ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789
RTNETLINK answers: Operation not supported

This looks like a kernel problem: the kernel cannot create the vxlan0 device, so it apparently lacks VXLAN support.

The Operation not supported message came from here.
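The manual test above can be wrapped in a small probe. This is only a sketch under assumptions: the device name `vxlan-probe0`, the function name `probe_vxlan`, and the overridable `IP_CMD` variable are made up for illustration, and the probe omits the `group` and `dev` options of the original command so that it does not depend on `eth0` existing. Creating a real device needs root.

```shell
#!/bin/sh
# IP_CMD is a hypothetical override so the probe can be tried with a stub;
# by default it is the real iproute2 "ip" binary.
IP_CMD="${IP_CMD:-ip}"

# Try to create a throwaway VXLAN device and report the result.
# Prints "supported", "unsupported", or "error: <message>".
probe_vxlan() {
    probe_dev="${1:-vxlan-probe0}"   # hypothetical device name
    if err=$("$IP_CMD" link add "$probe_dev" type vxlan id 42 dstport 4789 2>&1); then
        # Creation worked, so remove the test device again.
        "$IP_CMD" link delete "$probe_dev" >/dev/null 2>&1 || true
        echo "supported"
    else
        case "$err" in
            *"Operation not supported"*) echo "unsupported" ;;
            *) echo "error: $err" ;;
        esac
    fi
}
```

Calling `probe_vxlan` as root on the affected machine should print `unsupported` if the kernel answers the way it did above. Any other failure, such as missing privileges, is reported with the original error text rather than being guessed at.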

ghost commented 8 years ago

UDP mode works for me, but VXLAN mode fails:

I0525 17:49:17.972309 13374 local_manager.go:179] Picking subnet in range 10.1.1.0 ... 10.1.255.0
I0525 17:49:18.014720 13374 manager.go:246] Lease acquired: 10.1.37.0/24
I0525 17:49:18.015534 13374 network.go:58] Watching for L3 misses
I0525 17:49:18.015737 13374 network.go:66] Watching for new subnet leases
I0525 17:55:08.039796 13374 network.go:153] Handling initial subnet events
I0525 17:55:08.039914 13374 device.go:159] calling GetL2List() dev.link.Index: 45
I0525 17:55:08.040156 13374 device.go:164] calling NeighAdd: 192.168.2.110, fe:da:99:67:06:a7
I0525 18:02:34.727390 13374 network.go:225] L3 miss: 10.1.42.2
I0525 18:02:34.727728 13374 device.go:187] calling NeighSet: 10.1.42.2, fe:da:99:67:06:a7
I0525 18:02:34.728465 13374 network.go:236] AddL3 succeeded
I0525 18:03:09.456014 13374 network.go:220] Ignoring not a miss: fe:da:99:67:06:a7, 10.1.42.2

wanghl@wanghl-vm:~/mygo/src/github.com/coreos/flannel/bin$ lsmod |grep vxlan
vxlan 37619 0
ip_tunnel 23768 1 vxlan

My system: Linux wanghl-vm 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux (VMware)

ghost commented 8 years ago

I tested VXLAN on Ubuntu 14.04 (VMware) with kernel 3.19 and it succeeded.

xidui commented 8 years ago

@JoshuaAndrew

xidui@kubernetes-slave-2:~$ lsmod
Module                  Size  Used by

On the Linode machine the lsmod output is empty: no modules are loaded.
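An empty `lsmod` listing can mean two different things: the module is simply not loaded, or the kernel has no module list at all. A minimal sketch for telling these apart by reading `/proc/modules` (the file `lsmod` formats) is shown here. The function name `module_state` and its three labels are made up for illustration, and treating an unreadable `/proc/modules` as "unavailable" assumes a kernel built without loadable module support.

```shell
#!/bin/sh
# Report whether a module appears in a /proc/modules style file.
# Prints "loaded", "not-loaded", or "unavailable" (file missing or unreadable).
module_state() {
    name="$1"
    modules_file="${2:-/proc/modules}"
    if [ ! -r "$modules_file" ]; then
        # Assumption: no readable module list usually means a monolithic kernel.
        echo "unavailable"
    elif awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
        echo "loaded"
    else
        echo "not-loaded"
    fi
}

module_state vxlan
```

Note that `not-loaded` does not prove VXLAN is missing, because the driver can also be compiled directly into the kernel.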

xidui commented 8 years ago

I finally solved this issue by reinstalling the kernel on the Linode machine, following this guide: https://www.linode.com/docs/tools-reference/custom-kernels-distros/run-a-distribution-supplied-kernel-with-kvm

xidui commented 8 years ago

Thanks all! You helped me a lot.

uschtwill commented 8 years ago

Thanks @xidui. I had to install a generic kernel, because my hosting provider's custom kernel didn't like vxlan.
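For anyone checking a provider-supplied kernel before switching, one possible approach is to look at the kernel build configuration for `CONFIG_VXLAN`. This is a sketch, not a definitive test: the function name `vxlan_config_state` is made up, and the two file locations are assumptions, since not every kernel exposes `/proc/config.gz` or ships a `/boot/config-*` file.

```shell
#!/bin/sh
# Report how VXLAN is configured in a kernel config file.
# Prints "builtin", "module", or "missing".
vxlan_config_state() {
    config_file="$1"
    # /proc/config.gz is gzip-compressed; files under /boot are plain text.
    case "$config_file" in
        *.gz) reader="zcat" ;;
        *)    reader="cat" ;;
    esac
    line=$("$reader" "$config_file" 2>/dev/null | grep -E '^CONFIG_VXLAN=' | head -n 1)
    case "$line" in
        CONFIG_VXLAN=y) echo "builtin" ;;
        CONFIG_VXLAN=m) echo "module" ;;
        *)              echo "missing" ;;
    esac
}

# Assumed locations; either file may be absent on a given system.
for f in /proc/config.gz "/boot/config-$(uname -r)"; do
    if [ -r "$f" ]; then
        echo "$f: $(vxlan_config_state "$f")"
    fi
done
```

If neither file is readable, the loop prints nothing and the result is inconclusive. In that case, trying to create a VXLAN device by hand, as earlier in this thread, is the more direct check.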