Where are you running RancherOS? (docker-machine, AWS, GCE, baremetal, etc.)
Baremetal
I'm running into a network initialization loop when setting MTU via DHCP interface-mtu on a machine using the igb network driver. Same issue as seen in this CoreOS bug: https://github.com/coreos/bugs/issues/1827.
When the MTU is set by dhcpcd, the igb driver cycles the link. When the link goes down, it appears /lib/dhcpcd/dhcpcd-hooks/10-mtu restores the MTU to 1500. When the link comes back up, dhcpcd appears to detect this, and then sends out a new discover, sets the MTU to 9000, igb cycles the link, and the whole process repeats ad nauseam.
Various attempts at workarounds:
Not working: dhcpcd's -K, --nolink option. Setting this via the kernel cmdline parameter rancher.network.interfaces.eth0.dhcp_args appears to stop dhcpcd from ever attempting a discover on boot.
Working: Set interface-mtu to the default 1500 on the dhcpd server. Use a cloud-config to configure the interface statically and set the MTU to 9000.
Working and favorite so far: Use iPXE to statically configure the boot interface and disable DHCP, e.g.:
#!ipxe
set base http://1.2.3.4/rancher
set version 1.3.0
echo Now booting host: ${hostname}, MAC: ${mac}, IP: ${ip}, RancherOS ${version}
initrd ${base}/releases/${version}/initrd
kernel ${base}/releases/${version}/vmlinuz initrd=initrd console=tty0 console=ttyS0,115200 rancher.autologin=ttyS0 hostname=${hostname} -- rancher.network.interfaces.*.dhcp=false rancher.network.interfaces.mac#${mac}.address=${ip}/16 rancher.network.interfaces.mac#${mac}.mtu=${mtu} rancher.state.dev=LABEL=RANCHER_STATE rancher.state.autoformat=[/dev/sda] rancher.cloud_init.datasources=[url:${base}/cfg/ssh,url:${base}/cfg/network] rancher.debug=true
boot
Note: I had to use an alternate separator after mac (# in the config above) instead of =. With an equals sign, netconf failed to parse the configuration and caused a panic on boot, e.g.:
[ ] init:info: [1/21] Starting preparefs WITH NIL cfg
[ ] init:info: [2/21] Starting save init cmdline WITH NIL cfg
[ ] init:error: EXITING: Failed to parse configuration: Invalid timestamp: '6c:ae:8b:3f:33:ea".mtu=9000' at line 94, column 14
[ ] init:error: EXTRA_CMDLINE: Additional property EXTRA_CMDLINE is not allowed
[ ] init:error: hostname: Additional property hostname is not allowed
[ ] init:error: autologin: Additional property autologin is not allowed
[ ] init:error: ssh_authorized_keys: Invalid type. Expected: array, given: null
It would be useful to add a test for netconf.findMatch with alternate separators after mac to make sure this functionality isn't accidentally removed in the future. It may also be useful for others to show an example of this in the docs.
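As a rough illustration of what such a test could cover, here is a minimal Go sketch. It assumes the match key has the form mac, then one separator character, then the MAC address. macFromMatch is a hypothetical stand-in written for this example, not the actual netconf.findMatch code, whose signature and internals may differ.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// macFromMatch is a hypothetical stand-in for a prefix-only check:
// it requires the "mac" prefix, skips exactly one separator character
// (whatever it is), and parses the remainder as a MAC address.
// It is NOT the real netconf.findMatch implementation.
func macFromMatch(match string) (net.HardwareAddr, bool) {
	if !strings.HasPrefix(match, "mac") || len(match) < 5 {
		return nil, false
	}
	hw, err := net.ParseMAC(match[4:])
	if err != nil {
		return nil, false
	}
	return hw, true
}

func main() {
	mac := "6c:ae:8b:3f:33:ea"
	// Each key uses a different separator after the "mac" prefix.
	for _, key := range []string{"mac=" + mac, "mac#" + mac, "mac:" + mac} {
		hw, ok := macFromMatch(key)
		fmt.Printf("%-22s ok=%v mac=%v\n", key, ok, hw)
	}
}
```

A real test would call netconf.findMatch with interface keys such as mac#6c:ae:8b:3f:33:ea and assert that the matching interface config is returned.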
The only remaining issue with the iPXE approach is that the iPXE variable ${netmask} is in dotted-decimal form rather than CIDR notation. netconf.applyAddress calls netlink.ParseAddr, which calls netlink.ParseIPNet. ParseIPNet passes the address and CIDR mask through net.ParseCIDR: https://github.com/vishvananda/netlink/blob/master/netlink.go#L26. Thus, when trying to use 255.255.0.0, network configuration fails with:
network_1 | [ ] netconf:error: Failed to apply address 1.2.3.5/255.255.0.0 to eth0: invalid CIDR address: 1.2.3.5/255.255.0.0
For now I can get by with hard coding the CIDR netmask in the iPXE config as above.
RancherOS Version: (ros os version) 1.3.0
Note on the separator used in the iPXE config above: netconf.findMatch allows an arbitrary separator after mac, as it only looks for the mac prefix: https://github.com/rancher/os/blob/master/netconf/netconf_linux.go#L110.