canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

cloud-init incorrectly always uses sysconfig as the Network Renderer for RHEL9 #5612

Closed chewborg closed 2 weeks ago

chewborg commented 1 month ago

Bug report

Since RHEL 9, network configurations are stored by default as NetworkManager keyfiles, not as sysconfig ifcfg files. See https://www.redhat.com/en/blog/rhel-9-networking-say-goodbye-ifcfg-files-and-hello-keyfiles and https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html-single/9.0_release_notes/index#enhancement_networking

However, a change to how cloud-init decides which network renderer to use (Issue 4131, Pull Request 4132) now causes sysconfig to be used in all cases, rather than the expected network-manager renderer. This is because the RHEL9 NetworkManager package always includes the file /usr/lib64/NetworkManager/*/libnm-settings-plugin-ifcfg-rh.so, and the presence of that file is now the deciding factor between the sysconfig and network-manager renderers, even though NetworkManager keyfiles are now the default.
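The selection behaviour described above can be illustrated with a simplified sketch. This is not cloud-init's actual code: `select_renderer` and the `checks` mapping are hypothetical names, and the availability checks are stubbed to mimic a RHEL 9.3+ host where the ifcfg-rh plugin is present. The first priority list is the one shown in the cloud-init log further down.

```python
from typing import Callable, Dict, List, Optional


def select_renderer(
    priority: List[str], available: Dict[str, Callable[[], bool]]
) -> Optional[str]:
    """Return the first renderer in `priority` whose availability check passes."""
    for name in priority:
        check = available.get(name)
        if check is not None and check():
            return name
    return None


# Hypothetical checks for a RHEL 9.3+ host: the ifcfg-rh plugin ships with
# NetworkManager, so both renderers report themselves as usable.
checks = {
    "sysconfig": lambda: True,
    "network-manager": lambda: True,
}

# Priority list taken from the log: sysconfig comes first, so it wins.
print(select_renderer(["sysconfig", "eni", "netplan", "network-manager", "networkd"], checks))
# → sysconfig

# With network-manager ordered ahead of sysconfig, it is selected instead.
print(select_renderer(["netplan", "network-manager", "networkd", "sysconfig", "eni"], checks))
# → network-manager
```

Under these assumptions, the outcome depends only on list order whenever both renderers are available, which is why keyfiles are never written with the default ordering.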

So cloud-init overrides the default behaviour, and a RHEL9 OS configured with cloud-init ends up in an unexpected state: the interfaces are managed by NetworkManager as ifcfg files in /etc/sysconfig/network-scripts/ rather than as the expected keyfiles in /etc/NetworkManager/system-connections/

Additional steps then need to be undertaken to migrate these configurations to keyfile format.

We expect to find the default state for RHEL9 networking when using an unmodified cloud-config, and there does not appear to be a way to override this other than adding a custom renderer list that excludes sysconfig in /etc/cloud/cloud.cfg.d/ in custom images.

Steps to reproduce the problem

  1. On a RHEL9 or Rocky9 OS, install the default cloud-init package from the RHEL repos.
  2. Use the image created from this instance in OpenStack to create a new VM.
  3. The interface configuration is always written as /etc/sysconfig/network-scripts/ifcfg-{intf}

Environment details

[root@rhel-as9-tc6k log]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 9.4 (Plow)
[root@rhel-as9 log]# cat /etc/sysconfig/network-scripts/readme-ifcfg-rh.txt 
NetworkManager stores new network profiles in keyfile format in the
/etc/NetworkManager/system-connections/ directory.

Previously, NetworkManager stored network profiles in ifcfg format
in this directory (/etc/sysconfig/network-scripts/). However, the ifcfg
format is deprecated. By default, NetworkManager no longer creates
new profiles in this format
...

cloud-init logs

2024-07-15 20:20:17,714 - subp.py[DEBUG]: Running command ['ip', '-6', 'addr', 'show', 'permanent', 'scope', 'global'] with allowed return codes [0] (shell=False, capture=True)
2024-07-15 20:20:17,718 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-07-15 20:20:17,723 - __init__.py[DEBUG]: Detected interfaces {'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}, 'eth0': {'downable': True, 'device_id': '0x0001', 'driver': 'virtio_net', 'mac': 'fa:16:3e:5c:c0:52', 'name': 'eth0', 'up': False}}
2024-07-15 20:20:17,723 - __init__.py[DEBUG]: no work necessary for renaming of [['fa:16:3e:5c:c0:52', 'eth0', 'virtio_net', '0x0001']]
2024-07-15 20:20:17,723 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'version': 1, 'config': [{'type': 'physical', 'mtu': 1500, 'subnets': [{'type': 'dhcp4'}], 'mac_address': 'fa:16:3e:5c:c0:52', 'name': 'eth0'}, {'type': 'nameserver', 'address': '10.26.2.2'}, {'type': 'nameserver', 'address': '10.26.3.192'}, {'type': 'nameserver', 'address': '10.26.3.21'}, {'type': 'nameserver', 'address': '10.26.3.144'}]}
2024-07-15 20:20:17,724 - util.py[DEBUG]: Writing to /run/cloud-init/sem/apply_network_config.once - wb: [644] 23 bytes
2024-07-15 20:20:17,725 - util.py[DEBUG]: Restoring selinux mode for /run/cloud-init/sem/apply_network_config.once (recursive=False)
2024-07-15 20:20:17,726 - util.py[DEBUG]: Restoring selinux mode for /run/cloud-init/sem/apply_network_config.once (recursive=False)
2024-07-15 20:20:17,729 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: ['sysconfig', 'eni', 'netplan', 'network-manager', 'networkd']
2024-07-15 20:20:17,733 - util.py[DEBUG]: Writing to /etc/sysconfig/network-scripts/ifcfg-eth0 - wb: [644] 176 bytes
2024-07-15 20:20:17,734 - util.py[DEBUG]: Restoring selinux mode for /etc/sysconfig/network-scripts/ifcfg-eth0 (recursive=False)
2024-07-15 20:20:17,735 - util.py[DEBUG]: Restoring selinux mode for /etc/sysconfig/network-scripts/ifcfg-eth0 (recursive=False)
2024-07-15 20:20:17,735 - util.py[DEBUG]: Reading from /etc/resolv.conf (quiet=False)
2024-07-15 20:20:17,735 - util.py[DEBUG]: Read 55 bytes from /etc/resolv.conf
2024-07-15 20:20:17,735 - util.py[DEBUG]: Writing to /etc/resolv.conf - wb: [644] 198 bytes
2024-07-15 20:20:17,736 - util.py[DEBUG]: Restoring selinux mode for /etc/resolv.conf (recursive=False)
2024-07-15 20:20:17,737 - util.py[DEBUG]: Restoring selinux mode for /etc/resolv.conf (recursive=False)
2024-07-15 20:20:17,738 - util.py[DEBUG]: Writing to /etc/NetworkManager/conf.d/99-cloud-init.conf - wb: [644] 72 bytes
2024-07-15 20:20:17,738 - util.py[DEBUG]: Restoring selinux mode for /etc/NetworkManager/conf.d/99-cloud-init.conf (recursive=False)
2024-07-15 20:20:17,739 - util.py[DEBUG]: Restoring selinux mode for /etc/NetworkManager/conf.d/99-cloud-init.conf (recursive=False)
2024-07-15 20:20:17,739 - util.py[DEBUG]: Writing to /etc/udev/rules.d/70-persistent-net.rules - wb: [644] 96 bytes
2024-07-15 20:20:17,739 - util.py[DEBUG]: Restoring selinux mode for /etc/udev/rules.d/70-persistent-net.rules (recursive=False)
2024-07-15 20:20:17,740 - util.py[DEBUG]: Restoring selinux mode for /etc/udev/rules.d/70-persistent-net.rules (recursive=False)
2024-07-15 20:20:17,740 - util.py[DEBUG]: Reading from /etc/sysconfig/network (quiet=True)
2024-07-15 20:20:17,740 - util.py[DEBUG]: Read 37 bytes from /etc/sysconfig/network
2024-07-15 20:20:17,740 - util.py[DEBUG]: Writing to /etc/sysconfig/network - wb: [644] 107 bytes
2024-07-15 20:20:17,742 - util.py[DEBUG]: Restoring selinux mode for /etc/sysconfig/network (recursive=False)
2024-07-15 20:20:17,742 - util.py[DEBUG]: Restoring selinux mode for /etc/sysconfig/network (recursive=False)
2024-07-15 20:20:17,743 - __init__.py[DEBUG]: Not bringing up newly configured network interfaces
2024-07-15 20:20:17,743 - main.py[DEBUG]: [local] Exiting. datasource DataSourceOpenStackLocal [net,ver=2] not in local mode.
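
The renderer decision is recorded in the "Selected renderer" line of the log above. A minimal sketch for extracting it, assuming the message format stays exactly as shown in this log (`find_selected_renderer` is a hypothetical helper, not part of cloud-init):

```python
import ast
import re
from typing import List, Optional, Tuple

# Matches the message format seen in the log above; other cloud-init
# versions may word this line differently.
_SELECTED = re.compile(r"Selected renderer '([^']+)' from priority list: (\[.*\])")


def find_selected_renderer(log_text: str) -> Optional[Tuple[str, List[str]]]:
    """Return (selected renderer, priority list) from the first matching log line."""
    for line in log_text.splitlines():
        match = _SELECTED.search(line)
        if match:
            # The priority list is printed as a Python list literal.
            return match.group(1), ast.literal_eval(match.group(2))
    return None


sample = (
    "2024-07-15 20:20:17,729 - __init__.py[DEBUG]: Selected renderer 'sysconfig' "
    "from priority list: ['sysconfig', 'eni', 'netplan', 'network-manager', 'networkd']"
)
print(find_selected_renderer(sample))
# → ('sysconfig', ['sysconfig', 'eni', 'netplan', 'network-manager', 'networkd'])
```

On a real instance the same function could be fed the contents of the cloud-init log file, typically /var/log/cloud-init.log.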

blackboxsw commented 4 weeks ago

Thank you @chewborg for filing this bug and improving cloud-init. Can you clarify what behavior is broken by cloud-init's decision to render sysconfig files instead of files in /etc/NetworkManager/system-connections/? Is the network inoperable, or does this conflict with other network configuration emitted on the system in NetworkManager/system-connections/*?

My understanding from the bug is that cloud-init is still rendering content in what's considered "deprecated behavior" by the /etc/sysconfig/network-scripts/readme-ifcfg-rh.txt file.

@ani-sinha this request/issue seems to conflict with the Red Hat/Rocky specific changes we added in #4132, which was a strict decision to always use sysconfig if the ifcfg-rh plugin is present. I have no visibility into RHBZ: 2194050, so I don't know specifically what bug #4132 was trying to fix.

Do we want to consider either of the following:

ani-sinha commented 4 weeks ago

In RHEL 9 use of Sysconfig renderer is the default. That is because the Sysconfig renderer has the higher priority and the ifcfg plug-in is available. We will never make a major decision to flip the switch and always use network manager renderer by default in the middle of the major RHEL version. If users want, they can override the default priority and use network manager renderer with a higher priority so that it takes precedence over Sysconfig renderer.

In RHEL 10 network manager renderer is the default and is the only way to configure the network. Therefore network manager renderer has the higher priority over Sysconfig renderer (and you can't enable Sysconfig renderer as the ifcfg plugin is absent).

ani-sinha commented 4 weeks ago

In terms of documentation, I will ask our doc team to check whether this has been documented in one of the RH KCs and, if not, to add one.

ani-sinha commented 4 weeks ago

Another data point is this - RHEL 9.2 and below did not support network manager renderer. It's only RHEL 9.3 and above that supports both Sysconfig and network manager renderer. Hence we won't make network manager renderer default for RHEL 9. Instead we will use the RHEL 9 train to stabilize network manager renderer before making the big switch to making it the default in RHEL 10.

chewborg commented 3 weeks ago

Hi @ani-sinha

Now I'm not sure if we are talking about cloud-init or the RHEL OS in the above. For a RHEL v9.0 installation the default is to use NetworkManager keyfiles, and I just did a quick KVM install via the ISO to verify:

[root@rhel90-iso-install patrick]# nmcli -f NAME,TYPE,AUTOCONNECT,ACTIVE,DEVICE,STATE,FILENAME connection show 
NAME    TYPE      AUTOCONNECT  ACTIVE  DEVICE  STATE      FILENAME                                                   
enp1s0  ethernet  yes          yes     enp1s0  activated  /etc/NetworkManager/system-connections/enp1s0.nmconnection 
[root@rhel90-iso-install patrick]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 9.0 (Plow)

But perhaps I'm misinterpreting and you are telling me that cloud-init should only use sysconfig and not the network-manager renderer. If so I wasn't aware of this, or at least I didn't find this documented. I'm probably out of my depth knowledgewise so I guess I didn't find what I expected, and I'm probably not alone there.

Honestly, as the nmcli tool will work with either ifcfg or nmconnection files for most of its functionality, it's not a big deal I guess. If you want an OpenStack VM configured by cloud-init to use keyfiles, you can either run nmcli connection migrate

Or create a custom image with something like

cat /etc/cloud/cloud.cfg.d/92_network_setup.cfg 
system_info:
  network:
    renderers: ['network-manager']

And you'll get the state we wish to use.

In that case I think a solution to the issue I raised is to ensure the cloud-init documentation states that the network renderer will be sysconfig by default for RHEL 9 versions (and clones).

ani-sinha commented 3 weeks ago

@chewborg

Now I'm not sure if we are talking about cloud-init or the RHEL OS in the above

cloud-init.

But perhaps I'm misinterpreting and you are telling me that cloud-init should only use sysconfig and not the network-manager renderer.

No, what I am saying is this. For RHEL 9.0, 9.1 and 9.2, cloud-init can only use the sysconfig renderer, as support for the network manager renderer is absent. From RHEL 9.3 onwards, both the sysconfig and network manager renderers can be used.

Or create a custom image with something like

cat /etc/cloud/cloud.cfg.d/92_network_setup.cfg 
system_info:
 network:
   renderers: ['network-manager']

So essentially what you are doing here is removing other renderers and keeping only network-manager renderer. You do not need to do that. You can simply give network-manager renderer a higher priority over sysconfig, something like:

system_info:
  network:
    renderers: ['netplan', 'network-manager', 'networkd', 'sysconfig', 'eni']

and then it will give priority to network-manager renderer. Again, remember that this will only work for RHEL 9.3 and above. By default, cloud-init in RHEL 9.3 and above will continue to use sysconfig renderer.
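
The difference between removing the other renderers and merely reordering them can be sketched as below. This is an illustration only: `prefer` and `render_dropin` are hypothetical helpers, `DEFAULT_PRIORITY` is copied from the priority list in the log earlier in this thread, and the `system_info`/`network`/`renderers` nesting follows the drop-in file quoted above.

```python
from typing import List

# Priority list as reported in the cloud-init log in this issue.
DEFAULT_PRIORITY = ["sysconfig", "eni", "netplan", "network-manager", "networkd"]


def prefer(renderer: str, priority: List[str]) -> List[str]:
    """Move `renderer` to the front, keeping the others as fallbacks in order."""
    return [renderer] + [name for name in priority if name != renderer]


def render_dropin(priority: List[str]) -> str:
    """Build the text of a cloud.cfg.d drop-in that sets the renderer order."""
    quoted = ", ".join(f"'{name}'" for name in priority)
    return f"system_info:\n  network:\n    renderers: [{quoted}]\n"


print(render_dropin(prefer("network-manager", DEFAULT_PRIORITY)))
```

The generated text could be saved under /etc/cloud/cloud.cfg.d/ (for example as the 92_network_setup.cfg file mentioned above). The exact order of the fallback renderers here differs from the list in the comment; only the relative position of network-manager and sysconfig matters for this issue.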

I will check with our doc team to see if we can better document this.

TheRealFalcon commented 2 weeks ago

Thanks for the additional context Ani.

From upstream's perspective, this works as expected. Cloud-init correctly determines that sysconfig is installed and uses the sysconfig renderer if sysconfig is ordered earlier in the renderer list.

We have a templated config file that can be modified to specify renderer ordering when building cloud-init, though it looks like this is already being overridden downstream. If you want to submit a PR to make a change specific to Rocky in this template, we're happy to accept it.