canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

network config can't override kernel command line arguments #4089

Closed ubuntu-server-builder closed 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #2011298

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = 2023-03-13T17:57:40.933444+00:00
date_created = 2023-03-11T13:04:38.500430+00:00
date_fix_committed = None
date_fix_released = None
id = 2011298
importance = undecided
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/2011298
milestone = None
owner = surda
owner_name = Peter Surda
private = False
status = invalid
submitter = surda
submitter_name = Peter Surda
tags = []
duplicates = []

Launchpad user Peter Surda(surda) wrote on 2023-03-11T13:04:38.500430+00:00

This isn't really a bug, it's more of a missing functionality.

The quick description is that if you specify the network configuration by a kernel command line parameter (ip=blahblahblah), there doesn't appear to be a way to override it by defining a different network setup in cloud-init. The kernel command line will always take precedence no matter what you define in meta-data or user-data.

This is the conclusion I drew both from the documentation and from reading the source on GitHub. The kernel command line has absolute priority and can't be overridden. I found a workaround: writing the config file (/etc/netplan/50-cloud-init.yml in my case) directly through "write_files" and then executing "netplan apply" in "runcmd". This does what I want, but it doesn't look nice.
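The workaround described above could look roughly like the following user-data. This is a minimal sketch, not the reporter's actual file: the addresses (taken from the 192.0.2.0/24 documentation range), the gateway, the nameserver, the bond mode, and the file permissions are all placeholder assumptions; only the use of write_files, runcmd, and the netplan file path come from the description.

```yaml
#cloud-config
# Sketch of the write_files + runcmd workaround.
# All addresses, the bond mode, and the permissions are placeholders (assumptions).
write_files:
  - path: /etc/netplan/50-cloud-init.yaml
    permissions: "0600"
    content: |
      network:
        version: 2
        ethernets:
          eno1:
            dhcp4: false
            dhcp6: false
          eno2:
            dhcp4: false
            dhcp6: false
        bonds:
          bond0:
            interfaces: [eno1, eno2]
            parameters:
              mode: active-backup      # placeholder bond mode
            addresses:
              - 192.0.2.10/29          # placeholder addresses
              - 192.0.2.11/29
            routes:
              - to: default
                via: 192.0.2.9         # placeholder gateway
            nameservers:
              addresses: [192.0.2.53]  # placeholder resolver
runcmd:
  # Apply the file written above, replacing the single-interface
  # setup derived from the kernel command line.
  - netplan apply
```

Because cloud-init itself renders /etc/netplan/50-cloud-init.yaml, the file may be regenerated on a later boot unless cloud-init's network configuration is disabled.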

Perhaps you're wondering why I need this. I boot Ubuntu MAAS images over the network. In some data centers DHCP isn't available, and at the same time I need to assign more than one IP address to the system and/or configure bonding. So the bootloader constructs the kernel command line and the machine boots, but it only has one ethernet interface configured, with one IP address. I can't remove the kernel command line argument, because then cloud-init wouldn't know how to configure the network in order to download the user-data / meta-data.

Please consider providing an alternative approach to the workaround I explained above.

ubuntu-server-builder commented 1 year ago

Launchpad user James Falcon(falcojr) wrote on 2023-03-13T13:39:05.951658+00:00

As you mention, this is by design and I'm not sure there's an easy path to change this due to backwards compatibility issues. In general, it shouldn't be necessary to have multiple network configs, and there's not enough information in your example for me to understand the use case.

Is it possible to provide a more concrete example? Can you attach the arguments being provided on the kernel cli along with the 50-cloud-init.yml you're manually generating, or any other config you want applied?

ubuntu-server-builder commented 1 year ago

Launchpad user Peter Surda(surda) wrote on 2023-03-13T14:07:42.516856+00:00

Here is sanitized data to match the description from earlier:

kernel command line:

initrd=initrd.cpio initrd=squashfs rootfstype=squashfs root=/squashfs ip=1.2.3.4::1.2.3.1:255.255.255.248::eno1:off:1.1.1.1 overlayroot=tmpfs:recurse=0 systemd.clock-usec=1678710046000000 ds=nocloud-net;s=https://cloud-init.sanitized/
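Read against the kernel's documented field order for ip= (client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf:dns0-ip), the parameter above sets a single static address on eno1 with autoconfiguration off. A hand translation into netplan v2 syntax might look like this; it is an illustration only, and the file cloud-init actually renders from the command line may differ in detail.

```yaml
# Hand translation of the ip= parameter above (illustrative assumption,
# not the file cloud-init generates).
network:
  version: 2
  ethernets:
    eno1:                      # device field
      dhcp4: false             # autoconf field is "off"
      addresses:
        - 1.2.3.4/29           # client-ip; netmask 255.255.255.248 is a /29
      routes:
        - to: default
          via: 1.2.3.1         # gw-ip field
      nameservers:
        addresses: [1.1.1.1]   # dns0-ip field
```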

/etc/netplan/50-cloud-init.yaml (which I generate separately)

network:
  version: 2
  ethernets:
    eno1:
      addresses: []
      dhcp4: false
      dhcp6: false
      match:
        macaddress: 11:22:33:44:55:66
    eno2:
      addresses: []
      dhcp4: false
      dhcp6: false
      match:
        macaddress: 11:22:33:44:55:67
  bonds:
    bond0:
      addresses:

So the server boots and can run based on the network config provided on the command line (a single IP address assigned by the bootloader to whichever interface is plugged in). However, after cloud-init finishes, the server needs to run containers / VMs, so I need to assign more than one IP address to it, and the ethernet devices need to use bonding. I suppose I could move this functionality to ansible, which is what I use after cloud-init finishes, but then the network configuration would be split across two different structures, and I'd like to avoid that.

I understand this is a fringe case. If addressing this would be too much work, maybe update documentation to explain that in situations like this, you can work around the limitations by putting the netplan yaml into write_files and then use netplan apply in runcmd, so that other people will have it easier than me.

ubuntu-server-builder commented 1 year ago

Launchpad user James Falcon(falcojr) wrote on 2023-03-13T17:57:36.148473+00:00

As you say, it's kind of a fringe case for us. Cloud-init expects there to be an authoritative source for network configuration. Most clouds have a link-local metadata service that cloud-init can connect to before networking has come up to get the networking config. In the absence of this or a config on disk, it'll default to DHCP on eth0 (or a similar interface). Beyond this, it is expected that users can configure things themselves.

Note that after first boot, if you don't want cloud-init involved at all with the network, you may want to also set network configuration disabled like so: https://cloudinit.readthedocs.io/en/latest/reference/network-config.html#disabling-network-configuration
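As a sketch of what the linked documentation section describes, network configuration can be disabled by dropping a small config file onto the system. The file name below is only a common convention and should be treated as an assumption; the `network: config: disabled` setting is the relevant part.

```yaml
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg  (file name is a convention, not required)
# Tells cloud-init to leave network configuration alone on subsequent boots.
network:
  config: disabled
```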

I'm going to close this as Invalid since it works as intended, but if we find a stronger use case, we can always re-open it.