coreos / bugs

Issue tracker for CoreOS Container Linux
https://coreos.com/os/eol/

coreos-vagrant doesn't provide a working flannel with virtualbox #2223

Open tomdee opened 6 years ago

tomdee commented 6 years ago

Issue Report

When installing Container Linux from the coreos-vagrant repo, it doesn't provide a working flannel install.

This is because VirtualBox gives each VM two interfaces, one for outgoing traffic and one for internal traffic.

The first interface (eth0) gets the same IP address on all nodes, which means that flanneld gets the same lease on all nodes.
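One way to confirm the duplicate address is to collect each node's eth0 address and look for repeats. The helper below is only a sketch: the function name `find_duplicate_ips` and the "node ip" input format are assumptions made for illustration, not part of coreos-vagrant or flannel.

```shell
# Read "node ip" pairs on stdin (one pair per line) and print
# every IP address that is reported by more than one node.
find_duplicate_ips() {
  awk '
    { count[$2]++ }                                   # tally each address
    END { for (ip in count) if (count[ip] > 1) print ip }
  ' | sort
}
```

The input could be gathered by hand, for example by running `ip -4 -o addr show eth0` on each node and noting the address. If every node reports the same eth0 address, that address is printed once; if all addresses are unique, nothing is printed.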

Bug

Container Linux Version

cat /etc/os-release 
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1576.1.0
VERSION_ID=1576.1.0
BUILD_ID=2017-10-26-0503
PRETTY_NAME="Container Linux by CoreOS 1576.1.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Virtualbox

Expected Behavior

flannel should work, with each node getting its own lease

Actual Behavior

Flannel doesn't work since all nodes share the same lease

Reproduction Steps

Use multiple nodes and the virtualbox provider

FrankReh commented 5 years ago

I ran into this problem and found related advice in the flannel documentation at https://github.com/coreos/flannel/blob/master/Documentation/running.md. Because that advice covers how to update the cloud-config and seems to predate Ignition, I still had a little bit left to figure out.

The advice was

Running on Vagrant

Vagrant has a tendency to give the default interface (one with the default route) a non-unique IP (often 10.0.2.15).

This causes flannel to register multiple nodes with the same IP.

(This is exactly what was happening with my MacOS/vagrant/virtualbox/CoreOS setup.)

To work around this issue, use --iface option to specify the interface that has a unique IP.

If you're running on CoreOS, use cloud-config to set coreos.flannel.interface to $public_ipv4.
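For reference, the cloud-config route described in the quoted advice might look roughly like the sketch below. The function name `write_flannel_user_data` and the idea of writing the file from a shell helper are assumptions for illustration; only the `coreos.flannel.interface` key and the `$public_ipv4` value come from the advice itself.

```shell
# Write a minimal cloud-config that points flannel at the node's
# unique address. The quoted 'EOF' stops the shell from expanding
# $public_ipv4, so the literal placeholder ends up in the file and
# is left for cloud-config to substitute on the node.
write_flannel_user_data() {
  cat > "$1" <<'EOF'
#cloud-config
coreos:
  flannel:
    interface: $public_ipv4
EOF
}
```

Calling `write_flannel_user_data user-data` would write the fragment to a file named `user-data` in the current directory. This applies only to cloud-config based setups, not to the Ignition-based cl.conf.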

So applying this advice to the cl.conf file, I added the interface option line for eth1 and got

flannel:
  etcd_prefix: "/flannel/network"
  interface: "eth1"

and this did the trick.
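To check which interface name belongs in the `interface:` line on a given setup, the output of `ip -4 -o addr show` can be filtered for the first interface that is neither loopback nor the shared NAT address. This is a sketch under assumptions: the function name `pick_flannel_iface` is hypothetical, and the default NAT address 10.0.2.15 is taken from the flannel advice quoted above ("often 10.0.2.15"), so it may differ elsewhere.

```shell
# Read `ip -4 -o addr show` output on stdin and print the first
# interface that is not loopback and does not carry the shared NAT
# address (first argument, defaulting to 10.0.2.15).
pick_flannel_iface() {
  awk -v nat="${1:-10.0.2.15}" '
    $3 == "inet" {
      split($4, a, "/")              # strip the prefix length
      if ($2 != "lo" && a[1] != nat) { print $2; exit }
    }'
}
```

On a node, `ip -4 -o addr show | pick_flannel_iface` would be expected to print `eth1` for the VirtualBox layout described in this issue. Other interfaces present on the node (for example a bridge created later) could change the result, so treat the output as a hint rather than a guarantee.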

This change may only be valid for VirtualBox, as I don't know what the VMware VMs look like, but a comment in the Vagrantfile leads me to guess that eth1 is the interface with a unique IP there as well. So maybe the flannel "Running on Vagrant" section at https://github.com/coreos/flannel/blob/master/Documentation/running.md#running-on-vagrant should be updated with a note about VirtualBox and the Ignition input file, or maybe cl.conf could be updated to include this line after https://github.com/coreos/coreos-vagrant/blob/08572747857bab12c4b9e81632063c4ccc94651d/cl.conf#L33