harvester / harvester

Open source hyperconverged infrastructure (HCI) software
https://harvesterhci.io/
Apache License 2.0

[BUG] Network interfaces down after boot on non-default MTU #4780

Open UninspiredNickname opened 9 months ago

UninspiredNickname commented 9 months ago

Describe the bug
If the MTU has been set to a value other than 1500 and the network interface doesn't support an MTU of 1500, the network interfaces fail to come up after boot.

To Reproduce
Steps to reproduce the behavior:

  1. Install Harvester (the MTU for mgmt-br and mgmt-bo has to be changed manually via tty2 during installation; setting it via the MTU field doesn't change this value)
  2. On first boot, networking will be down

Expected behavior
The MTU should be applied correctly to the mgmt-br and mgmt-bo interfaces, and they should be brought up.

Environment

Additional context
It seems that the MTU is not applied to mgmt-bo and mgmt-br (ip link shows 1500 instead of 1442, even though 1442 is set in /etc/sysconfig/network/ifcfg-mgmt-br and ifcfg-mgmt-bo). The simplest way to apply the change is to call wicked ifup all twice (log output from the first and second call is attached below). Another way, using only ip and not wicked, is:

ip link set mgmt-br mtu 1442
ip link set mgmt-bo mtu 1442
ip link set ens3 up
ip link set ens3 down

I've tried adding ip link set $INTERFACE mtu 1442 to pre-up and post-up in the wicked scripts. It does correctly apply the MTU to the interfaces, but ens3 still has to be toggled on and off for the interfaces to be brought up.
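
The mismatch described above (an MTU present in the ifcfg file but not reflected by ip link) could be checked with a small helper along these lines. This is only a sketch: the function name compare_mtu and the IFCFG_DIR and SYSFS_NET override variables are made up for illustration, and it assumes the ifcfg files use a plain MTU=<value> line, optionally quoted.

```shell
#!/bin/sh
# Hypothetical helper: compare the MTU configured in an ifcfg file with the
# MTU the kernel currently reports for the same interface.
# Both directories are overridable so the check can be tried on copies.
IFCFG_DIR="${IFCFG_DIR:-/etc/sysconfig/network}"
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

compare_mtu() {
    iface="$1"
    cfg="$IFCFG_DIR/ifcfg-$iface"
    # Take the last MTU= line and strip optional quotes and spaces.
    want=$(sed -n 's/^MTU=//p' "$cfg" 2>/dev/null | tail -n 1 | tr -d "\"' ")
    # The kernel exposes the live MTU as a plain number in sysfs.
    have=$(cat "$SYSFS_NET/$iface/mtu" 2>/dev/null)
    if [ -z "$want" ]; then
        echo "$iface: no MTU configured (actual: ${have:-unknown})"
    elif [ "$want" = "$have" ]; then
        echo "$iface: ok ($have)"
    else
        echo "$iface: MISMATCH configured=$want actual=${have:-unknown}"
        return 1
    fi
}
```

Running compare_mtu mgmt-br and compare_mtu mgmt-bo on an affected node should then report a mismatch for the situation described in this issue, and ok once the MTU has actually been applied.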

first.log second.log

UninspiredNickname commented 9 months ago

My awful fix is to add ip link set $INTERFACE mtu 1442 as a pre-up action to wicked:setup_bond.sh and wicked:setup_bridge.sh, and to add wicked ifup all to stages.network.commands in /oem/90_custom.yaml. It seems to me, though, that this is a problem with wicked, as it tries to bring up the ens3 interface without applying the modified MTU first:

kernel: ens3: mtu greater than device maximum
kernel: mgmt-bo: (slave ens3): Error -22 calling dev_set_mtu

lotherk commented 3 months ago

It seems the installer only adds the MTU to the bridge configs, not to the configuration file of the underlying interface itself.

I fixed this by adding MTU=1400 to the ifcfg-ens18 file in /etc/sysconfig/network/.
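
A fix of this kind could be scripted roughly as follows. This is a minimal sketch, not something from the thread: set_ifcfg_mtu is a hypothetical helper name, and it assumes the ifcfg file holds at most simple MTU=<value> lines. It rewrites the file through a temporary copy, so ownership and permissions may need checking afterwards.

```shell
#!/bin/sh
# Hypothetical helper: set or replace the MTU line in a sysconfig ifcfg file.
set_ifcfg_mtu() {
    file="$1"
    mtu="$2"
    # Accept only a non-empty string of digits.
    case "$mtu" in
        ''|*[!0-9]*) echo "invalid MTU: $mtu" >&2; return 2 ;;
    esac
    tmp="$file.tmp.$$"
    # Drop any existing MTU= line, then append the new value.
    { grep -v '^MTU=' "$file" 2>/dev/null; echo "MTU=$mtu"; } > "$tmp" \
        && mv "$tmp" "$file"
}
```

For the case above, the call would be set_ifcfg_mtu /etc/sysconfig/network/ifcfg-ens18 1400. Whether such a change survives a reboot on a Harvester node is not stated in this thread and would need to be verified.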

innobead commented 3 months ago

cc @mingshuoqiu @starbops

w13915984028 commented 3 months ago

Harvester does not support nested virtualization for production.

The installer allows setting the MTU for the mgmt bond interface (https://github.com/harvester/harvester-installer/blob/473895517c8304182ef85e5d26c64b8f579d03e3/pkg/config/config.go#L81), but it does not touch the individual bonded NICs.

In this example, the Harvester node is backed by an OpenStack VM that uses MTU 1442. The first issue/challenge is that, before the Harvester installer runs, the OS needs to set the NIC to MTU 1442 and bring the NIC up. @lotherk's solution uses a network config to guide the OS, which is a feasible solution. @UninspiredNickname's solution can also work.

We can add @lotherk's solution to https://docs.harvesterhci.io/v1.3/troubleshooting/index first.

On the other hand, if the network supports jumbo frames and the user sets the mgmt interface to MTU 9000 while the underlying NICs still use the default MTU of 1500, this may be an issue and needs to be double-checked. Harvester may propagate the user-configured MTU to the bonded NICs.
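
One way to double-check that case on a running node is to compare the bond's MTU with the MTU of each bonded NIC. The sketch below reads the values from sysfs; check_bond_mtu and the SYSFS_NET override are hypothetical names used for illustration, and the bond name mgmt-bo is taken from this issue.

```shell
#!/bin/sh
# Hypothetical helper: report bonded NICs whose MTU differs from the bond's.
# SYSFS_NET is overridable so the check can be tried against a test tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

check_bond_mtu() {
    bond="$1"
    bond_mtu=$(cat "$SYSFS_NET/$bond/mtu" 2>/dev/null) || return 2
    rc=0
    # The bonding driver lists member NICs, space separated, in bonding/slaves.
    for nic in $(cat "$SYSFS_NET/$bond/bonding/slaves" 2>/dev/null); do
        nic_mtu=$(cat "$SYSFS_NET/$nic/mtu" 2>/dev/null)
        if [ "$nic_mtu" != "$bond_mtu" ]; then
            echo "$nic: mtu ${nic_mtu:-unknown} differs from $bond mtu $bond_mtu"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling check_bond_mtu mgmt-bo prints one line per mismatching NIC and returns non-zero if any were found, which makes it usable in a boot-time check or a troubleshooting script.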