coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

Set MACAddressPolicy=none for bridges/bonds/teams #919

Closed ghost closed 1 year ago

ghost commented 3 years ago

Describe the bug: After creating a bond interface via NetworkManager (nmcli, nmtui), the bond has an autogenerated MAC address.

Reproduction steps:

  1. Create a bond via NetworkManager with at least 2 slaves
  2. Inspect the bond's MAC address
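
The steps above can be sketched with nmcli roughly as follows. This is a minimal sketch, not the reporter's exact commands: the device names ens2 and ens3, the bond name bond0, the active-backup mode, and the `run` dry-run helper are all assumptions made for illustration.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create bond0 with two ports, then list link MAC addresses for comparison.
# ens2 and ens3 are assumed device names; adjust them to the host.
reproduce_bond_mac() {
    run nmcli connection add type bond con-name bond0 ifname bond0 mode active-backup
    run nmcli connection add type ethernet ifname ens2 master bond0
    run nmcli connection add type ethernet ifname ens3 master bond0
    run ip -brief link show
}
```

Calling `DRY_RUN=1 reproduce_bond_mac` only prints the commands, which is a safe way to review them before running them on a real host.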

Expected behavior: The MAC address of the bond interface is that of one of its slaves (the first active one).

Actual behavior: After creating a bond interface via NetworkManager (nmcli, nmtui), the bond has an autogenerated MAC address, which breaks IP address assignment via DHCP in OKD.

darkmuggle commented 3 years ago

This is not a bug in either FCOS or NetworkManager. The default behavior is to use a random MAC address. If you require a specific MAC, you will need to configure it via nmcli or nmtui.

dustymabe commented 3 years ago

I guess it depends on how things are configured. According to https://wiki.linuxfoundation.org/networking/bonding#where_does_a_bonding_device_get_its_mac_address_from it should be the MAC of the first device added to the bond, which matches my previous experience here as well (from a previous lifetime before NetworkManager existed). I've played around with the various ethernet.cloned-mac-address NM settings trying to get it to use a non-autogenerated MAC for the bond (i.e. one derived from the subordinate devices) but haven't been able to.

Maybe @thom311 or @bengal could point us in the right direction by telling us which settings to tweak (or if there is a possible bug lurking).

dustymabe commented 3 years ago

Just checked against RHCOS (RHEL). It behaves the way I'd expect where the bond has the MAC of one of the subordinate devices. Notice that one of the entries below has permaddr and one doesn't:

[core@ignitionhost ~]$ rpm -q NetworkManager kernel
NetworkManager-1.30.0-9.el8_4.x86_64
kernel-4.18.0-305.10.2.el8_4.x86_64
[core@ignitionhost ~]$ 
[core@ignitionhost ~]$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:55:08:fa brd ff:ff:ff:ff:ff:ff
3: ens3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:55:08:fa brd ff:ff:ff:ff:ff:ff permaddr 52:54:00:26:3c:ce
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:55:08:fa brd ff:ff:ff:ff:ff:ff

bengal commented 3 years ago

Since systemd v242, udev sets a predictable MAC address (derived from the machine-id and the interface name) on virtual interfaces, including bonds and bridges. Because the bond then has a MAC address set by userspace, the kernel doesn't update the MAC when the first port is added to the bond.

See:

https://github.com/systemd/systemd/blob/v247/NEWS#L2332-L2358

This doesn't happen on RHEL8 because there is an older systemd version there.
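
One way to observe this on a running system is the kernel's `addr_assign_type` sysfs attribute, which records how an interface got its MAC. The sketch below only translates the numeric value into words; the function name is hypothetical, and reading the attribute for bond0 is shown as a commented usage line because it assumes a bond with that name exists.

```shell
# Translate /sys/class/net/<iface>/addr_assign_type into a description.
# Values follow the kernel's NET_ADDR_* constants.
describe_addr_assign_type() {
    case "$1" in
        0) echo "permanent (from hardware)" ;;
        1) echo "random (generated by the kernel)" ;;
        2) echo "stolen (copied from another device)" ;;
        3) echo "set (assigned by userspace)" ;;
        *) echo "unknown" ;;
    esac
}

# Usage on a live host (assumes an interface named bond0):
# describe_addr_assign_type "$(cat /sys/class/net/bond0/addr_assign_type)"
```

With the udev policy described above, one would expect a bond to report the userspace-assigned case rather than the copied-from-a-port case.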

To change the MAC of the bond from NetworkManager you should set the ethernet.cloned-mac-address property of the bond connection. That value will override the address set by udev. If this doesn't work, it's probably a bug.
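
A minimal sketch of that approach, assuming the goal is to pin the bond to one port's hardware address: `permanent_mac` parses `ip link show` output (preferring `permaddr`, which appears once a port has taken over the bond's MAC), and `pin_bond_mac` writes the result into the bond profile. Both function names and the bond0/ens3 names in the usage line are illustrative assumptions.

```shell
# Read `ip link show dev IFACE` output on stdin and print the permanent MAC,
# falling back to the current link/ether address when no permaddr is shown.
permanent_mac() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "link/ether") current = $(i + 1)
            if ($i == "permaddr")   permanent = $(i + 1)
        }
    }
    END { print (permanent != "" ? permanent : current) }'
}

# Hardcode a port's MAC on the bond connection profile and re-activate it.
pin_bond_mac() {
    bond_con=$1
    port_dev=$2
    mac=$(ip link show dev "$port_dev" | permanent_mac)
    nmcli connection modify "$bond_con" ethernet.cloned-mac-address "$mac"
    nmcli connection up "$bond_con"
}

# Usage (assumed names): pin_bond_mac bond0 ens3
```

Note that this hardcodes a concrete address in the profile; it does not make the bond follow whichever port is attached first.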

ghost commented 3 years ago

Teaming, on the other hand, works right (from my point of view): it takes the MAC address from the first active interface. But I can't change the team runner type in kernel (dracut.cmdline) parameters during PXE installation of CoreOS. It is always set to roundrobin. Bonding works like a charm, apart from this issue with MAC addresses...

dustymabe commented 3 years ago

Thanks @bengal. That's not what I was expecting but certainly is the root cause here. It mentions

if a bridge interface is created without any slaves, and gains a slave later, then now the bridge does not inherit slave's MAC.

Thinking out loud here:

Would it be an option for NM to create the bond/bridge in this case with the subordinate devices from the start rather than adding them subsequently?

@bengal, you mention setting ethernet.cloned-mac-address but there is no way to set it to say "use subordinate device MAC" without actually hardcoding that device's MAC address IIUC. Am I right?

@servsav Want to give the following a try to see if you get the behavior you desire:

# /etc/systemd/network/98-bond-inherit-mac.link
[Match]
Type=bond

[Link]
MACAddressPolicy=none

bengal commented 3 years ago

But I can't change the team runner type in kernel (dracut.cmdline) parameters during PXE installation of CoreOS. It is always set to roundrobin.

Changing the team runner from kernel command line was implemented in dracut 51, but only in the network-legacy module:

https://github.com/dracutdevs/dracut/commit/e4483e5917b59918260ff0f0345abbea4a537f12

NetworkManager doesn't yet support the parameter. I filed an issue for that: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/774

ghost commented 3 years ago

@servsav Want to give the following a try to see if you get the behavior you desire:

# /etc/systemd/network/98-bond-inherit-mac.link
[Match]
Type=bond

[Link]
MACAddressPolicy=none

Thanks a lot! This does work as I need!
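
For anyone applying the same workaround by hand, the file can be written with a small helper like the one below. It is a sketch under assumptions: the function name is hypothetical, and the optional directory argument exists only so the helper can be tried against a scratch directory instead of /etc/systemd/network.

```shell
# Write the .link override quoted above into a systemd network directory.
# Defaults to /etc/systemd/network (which needs root); pass another directory to test.
write_bond_link_override() {
    link_dir=${1:-/etc/systemd/network}
    mkdir -p "$link_dir"
    cat > "$link_dir/98-bond-inherit-mac.link" <<'EOF'
[Match]
Type=bond

[Link]
MACAddressPolicy=none
EOF
}
```

Since udev applies .link files when a device first appears, the bond most likely has to be recreated (or the host rebooted) before the new policy takes effect.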

thom311 commented 3 years ago

IMO the default MACAddressPolicy is a serious misfeature on Fedora. RHEL8 and RHEL9 don't do this.

bengal commented 3 years ago

Would it be an option for NM to create the bond/bridge in this case with the subordinate devices from the start rather than adding them subsequently?

No, it's not possible to create the bond with other interfaces already enslaved from the beginning, because in kernel API the master-slave relationship is tracked on the slave through a 'master' attribute. Therefore the master needs to exist initially without any slave, and then ports can be attached by setting their 'master' attribute via netlink.
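
The same ordering is visible when doing it by hand with iproute2: the bond is created empty, and each port is then attached by setting its master. The sketch below is illustrative only; the function name and the `IP` override (for example `IP=echo` to preview the commands without root) are assumptions, not part of any tool mentioned here.

```shell
# Create an empty bond, then attach each port by setting its 'master' attribute.
# Set IP=echo to print the ip(8) arguments instead of executing them.
attach_ports() {
    bond=$1
    shift
    ${IP:-ip} link add name "$bond" type bond
    for port in "$@"; do
        # Ports are brought down first, since bonding usually refuses to enslave a device that is up.
        ${IP:-ip} link set dev "$port" down
        ${IP:-ip} link set dev "$port" master "$bond"
    done
    ${IP:-ip} link set dev "$bond" up
}

# Usage (assumed names): attach_ports bond0 ens2 ens3
```

Between the first command and the first `master` assignment the bond exists without ports, which is the window in which udev gets to assign its own MAC.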

bengal commented 3 years ago

@bengal, you mention setting ethernet.cloned-mac-address but there is no way to set it to say "use subordinate device MAC" without actually hardcoding that device's MAC address IIUC. Am I right?

Right, there isn't a way. I guess the reasons are:

  - on RHEL, MACAddressPolicy is disabled

dustymabe commented 3 years ago

IMO the default MACAddressPolicy is a serious misfeature on Fedora. RHEL8 and RHEL9 don't do this.

Is this enough of a misfeature that we should try to get it changed by talking with FESCO and submitting a change request?

thom311 commented 3 years ago

Is this enough of a misfeature that we should try to get it changed by talking with FESCO and submitting a change request?

In my opinion, "yes". But I recognize that other people think this is a desirable feature.

It's not only about bonds, although that is where it is most visible, because udev setting the MAC address prevents the bond from automatically inheriting the address of the first attached port. Suppose you have a naive script (or a tool that effectively does the same):

   SOFTWARE=dummy
   ip link add name if0 type $SOFTWARE
   ip link set if0 address 62:b4:fa:7e:2e:4f

then this races against udev changing the MAC address. OK, you might argue that the tool is buggy and it needs to wait for udev:

   SOFTWARE=dummy
   ip link add name if0 type $SOFTWARE
   udevadm settle
   ip link set if0 address 62:b4:fa:7e:2e:4f

(or set the address right away when creating the interface). Even if this is considered a bug of the tool, how many existing tools have this race? There may be reasons why the tool cannot configure the MAC address right away, but then requiring every tool to either exec udevadm or use libudev seems too much.
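
The "set the address right away" variant mentioned in parentheses could look like the sketch below, which passes the MAC in the same `ip link add` call so there is no separate step to race with udev. The function name and the `IP` override (for example `IP=echo` to preview the command) are assumptions for illustration.

```shell
# Create a software link with its MAC already set at creation time.
# Set IP=echo to print the ip(8) arguments instead of executing them.
create_link_with_mac() {
    link_name=$1
    link_mac=$2
    link_type=$3
    ${IP:-ip} link add name "$link_name" address "$link_mac" type "$link_type"
}

# Usage, mirroring the example above:
# create_link_with_mac if0 62:b4:fa:7e:2e:4f dummy
```

This only helps tools that already know the address when they create the interface, which is the limitation raised above.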

And the premise that the MAC address should be stable by default (based on the machine-id and the interface name) is not obviously agreeable to me.

dustymabe commented 2 years ago

xref: https://github.com/systemd/systemd/issues/15208

dustymabe commented 2 years ago

Had a discussion this morning with the NM team and also @keszybz representing systemd. There was some advocacy for both the current upstream behavior and the old behavior (at least for bridges and bonds). We decided for now that we (thanks @thom311) will start a discussion with upstream systemd (systemd-devel@lists.freedesktop.org) to debate the merits of MACAddressPolicy=none for bonds and bridges. We will send a separate mail to the Fedora devel@ list to encourage interested parties to take part in the upstream discussion.

Also, semi-related to the discussion, I filed a request upstream for being able to control the MACAddressPolicy via kernel arguments: https://github.com/systemd/systemd/issues/23294

dustymabe commented 2 years ago

The Fedora Change Request for this was accepted.

dustymabe commented 1 year ago

This change was merged in https://src.fedoraproject.org/rpms/systemd/pull-request/100 and first built in systemd-253~rc2-3.fc38.

dustymabe commented 1 year ago

The fix for this went into next stream release 38.20230310.1.0. Please try out the new release and report issues.

EDIT: That release was withdrawn.

dustymabe commented 1 year ago

Just to tie this off, I proposed a dracut change upstream to pick this up for the initramfs, but it was stalling out so I just made a downstream patch PR for the same thing.

dustymabe commented 1 year ago

The fix for this went into next stream release 38.20230322.1.0. Please try out the new release and report issues.

dustymabe commented 1 year ago

The fix for this went into testing stream release 38.20230414.2.0. Please try out the new release and report issues.

dustymabe commented 1 year ago

The fix for this went into stable stream release 38.20230414.3.0.