Improve Network Configuration

cprivitere commented 2 years ago

What I'd like: We're supporting using bottlerocket on bare metal as part of EKS Anywhere running on Equinix Metal.

One of the most limiting and difficult to configure aspects is the NIC configuration. This is due to bare metal servers (even those of a particular plan type like m3.large.x86 or c3.medium.x86) not always having identical configurations. Sometimes network cards aren't the same, sometimes they don't all get plugged into the same slot. The current method of specifying the exact name of the networking card makes supporting this sort of environment near impossible.

Ideas we think would be good:

Some way to specify a match pattern for NICs, so that bottlerocket will configure any that match the pattern. I.e., instead of enp5s0f0np0, just specify ens*

The ability to specify multiple NICs and have bottlerocket try configuring them in order and, once one works, use it as primary. In this scenario we would have a net.toml that looks like this (primary = auto is just a made-up idea to express the desired behavior):

            [enp131s0f0np0]
            dhcp4 = true
            dhcp6 = false
            primary = "auto"
            [enp5s0f0np0]
            dhcp4 = true
            dhcp6 = false
            primary = "auto"
            [enp1s0f0np0]
            dhcp4 = true
            dhcp6 = false
            primary = "auto"

The ability to specify MAC address and have it configure whatever NIC has that address.
Auto configuration of any NICs discovered to just do DHCP and pick the first available to communicate on.
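
For illustration only, here is a minimal Python sketch of the pattern-matching idea from the list above. It is not how netdog works today; `match_interfaces` and the interface names are hypothetical, and shell-style globs from the standard library `fnmatch` module stand in for whatever pattern syntax might actually be chosen. Note that a glob like ens* would not match a name like enp5s0f0np0; something like enp* or en* would be needed for that.

```python
from fnmatch import fnmatchcase


def match_interfaces(pattern, interface_names):
    """Return the interface names matching a shell-style glob, sorted.

    Sorting gives a deterministic order, which matters if the first
    match is later treated as the "primary" interface.
    """
    return sorted(name for name in interface_names if fnmatchcase(name, pattern))


# Hypothetical interface names, as they might appear on one bare metal host.
present = ["lo", "enp5s0f0np0", "enp5s0f1np1", "ens3"]

print(match_interfaces("enp*", present))  # ['enp5s0f0np0', 'enp5s0f1np1']
print(match_interfaces("ens*", present))  # ['ens3']
```

When a pattern matches more than one NIC, something still has to decide which of them is primary; the sorted order here is just one possible tie-breaker.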

zmrow commented 2 years ago

Thanks for the issue @cprivitere ! This is an area of Bottlerocket we're looking to continuously improve on, and these types of suggestions are really helpful.

A few quick thoughts as I read over these:

Some way to specify a match pattern for NICs, so that bottlerocket will configure any that match the pattern. I.e., instead of enp5s0f0np0, just specify ens*

This "match" use case is compelling; I'm wondering how we determine which NIC becomes primary if we're bringing up anything matching ens*.

The ability to specify multiple NICs and have bottlerocket try configuring them in order and, once one works, use it as primary.

Currently the "config generation" is a separate step from "bring up the NIC and wait for a lease", the latter of which is taken care of by the network manager (currently wicked). The network manager uses config to understand which NIC to bring up, and has to wait for a lease for each. If certain NICs don't exist, or aren't cabled, etc. the network manager would be waiting up to some timeout for each one, delaying the boot.
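
To make the boot-delay concern concrete, here is a rough Python sketch of "try NICs in order, first one with a lease wins". It is only an illustration: `pick_primary`, the `try_lease` callback, and the 30-second timeout are all hypothetical and do not correspond to anything in netdog or wicked. It assumes every failed attempt waits for the full timeout.

```python
def pick_primary(candidates, try_lease, timeout_s):
    """Try candidates in order and return (primary, seconds_spent_on_failures).

    try_lease(name, timeout_s) is a hypothetical callback that returns True
    if the interface obtained a DHCP lease within timeout_s seconds.
    """
    waited = 0
    for name in candidates:
        if try_lease(name, timeout_s):
            return name, waited
        # Assumption: a missing or uncabled NIC costs the whole timeout.
        waited += timeout_s
    return None, waited


# Interface order from the net.toml above; only the last one is cabled here.
order = ["enp131s0f0np0", "enp5s0f0np0", "enp1s0f0np0"]
cabled = {"enp1s0f0np0"}


def fake_try_lease(name, timeout_s):
    return name in cabled


print(pick_primary(order, fake_try_lease, 30))  # ('enp1s0f0np0', 60)
```

In this made-up case the boot is delayed by two full timeouts before a usable NIC is found, which is the cost described above.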

The ability to specify MAC address and have it configure whatever NIC has that address.

We've got an open issue for that! #2293

Auto configuration of any NICs discovered to just do DHCP and pick the first available to communicate on.

I like this idea, especially for quick development and a sane default. It does, however, raise the same question around "primary interface". :thinking:

Cajga commented 2 years ago

As I wrote in #2293, we are running into the same issue (EKS-A Bare Metal + no one knows the interface name ahead of time). I just wanted to add one more relevant problem: when someone boots up a bottlerocket OS with the wrong interface name in net.toml, there is not much useful information on the console: https://imgur.com/a/SmMYG2v

Apart from this, one cannot do anything to fix the problem itself even with access to the console. Please bear with me in case this is a stupid question as I do not have too much experience with bottlerocket yet, but is there a way to somehow let the user fix this problem with console access?

zmrow commented 2 years ago

@Cajga Thanks for the extra info. I totally understand the pain. As I mentioned above, the "interface config generation" is separate from bringing up the actual interfaces. netdog generates the interface configuration based on net.toml, and doesn't (currently) do any hardware probing or understand what "should" be in the box. wicked attempts to bring up the interfaces it has config for; it also doesn't know what could or should be in the box.

It brings up the philosophical question "What is the wrong interface name?". Perhaps the user specifies an interface that should be in the box, but the hardware malfunctions and doesn't come up properly.

Another thing we have to consider when thinking about a solution is that udev is doing things in parallel while the machine is booting. At config generation time the interface may not have been renamed by udev to its final name.

All that being said, we do want to make this more straightforward and easier to debug. I do like the idea of using MAC address or some type of name match to make configuration easier. A thought: if we used MAC address we could ensure a NIC exists (regardless of the name) with that MAC and if not, fail the config generation which would print a message to the console for easier debugging. :thinking:
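
As a sketch of the MAC address thought, the following Python looks up an interface by MAC and fails with a readable message if none is found. This is an illustration, not netdog code: `find_interface_by_mac` is a hypothetical name and the MAC addresses are invented. It relies on the Linux sysfs layout, where each interface exposes its hardware address in /sys/class/net/<name>/address, and the demo runs against a throwaway directory with the same layout so it works anywhere.

```python
import tempfile
from pathlib import Path


def find_interface_by_mac(mac, sysfs_net="/sys/class/net"):
    """Return the name of the interface whose MAC address equals `mac`.

    Raises LookupError if no interface has that address, so the caller
    can fail config generation with a useful console message.
    """
    wanted = mac.strip().lower()
    for iface in sorted(Path(sysfs_net).iterdir()):
        try:
            address = (iface / "address").read_text().strip().lower()
        except OSError:
            # Entries without a readable address file are skipped.
            continue
        if address == wanted:
            return iface.name
    raise LookupError(f"no network interface with MAC address {mac}")


# Demo: a temporary directory laid out like /sys/class/net (made-up MACs).
fake_sysfs = Path(tempfile.mkdtemp())
for name, address in [("lo", "00:00:00:00:00:00"), ("enp5s0f0np0", "3c:ec:ef:00:00:01")]:
    (fake_sysfs / name).mkdir()
    (fake_sysfs / name / "address").write_text(address + "\n")

print(find_interface_by_mac("3C:EC:EF:00:00:01", fake_sysfs))  # enp5s0f0np0
try:
    find_interface_by_mac("3c:ec:ef:00:00:02", fake_sysfs)
except LookupError as err:
    print(f"config generation failed: {err}")
```

The udev caveat still applies: the name returned at config generation time may not be the interface's final name, so matching on the MAC itself is likely more robust than resolving it to a name early.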

To answer your last question - Bottlerocket doesn't include a shell so there really isn't a way to drop into some sort of recovery shell.

cprivitere commented 2 years ago

So I would challenge the following presumptions:

1) That netdog + wicked is the right solution for bottlerocket (to be fair, I think you alluded to at least considering alternatives to wicked above, already). This current setup seems to provide challenges for bare metal and should be questioned if the OS is meant to work on non-virtualized platforms, which it is now.

2) That there's no way to get a shell. The docs indicate there's a way to spawn a container with a shell if you provide the correct configuration. - https://github.com/bottlerocket-os/bottlerocket#admin-container - Perhaps making an easy way to trigger that config by passing a simple debug parameter via the bootconfig or something would help folks trying to do debug work on bottlerocket.

displague commented 2 years ago

This "match" use case is compelling; I'm wondering how we determine which NIC becomes primary if we're bringing up anything matching ens*.

The netplan solution (https://netplan.io/design) seems to be to leave it to the user. The network devices are subgrouped into ethernets and wifis, and within those groups the user can specify the MAC addresses, drivers, and device names to match.

"primary" is not a meaningful distinction since kernel networking doesn't share that concept. If the significance of a "primary" interface is that bootup will wait for this device to be configured, then perhaps that behavior should be the parameter: required = true (optional = true) or timeout = <timeout>.

netdog + wicked is the right solution for bottlerocket

One of netplan's features is that it can render various configurations. Imagine how this could be done if a netdog renderer was added to netplan and netplan's syntax (cloud-config's networking v2 syntax) could be baked into the bottlerocket configuration.

bcressey commented 2 years ago

That netdog + wicked is the right solution for bottlerocket (to be fair, I think you alluded to at least considering alternatives to wicked above, already). This current setup seems to provide challenges for bare metal and should be questioned if the OS is meant to work on non-virtualized platforms, which it is now.

As you say, this is being challenged pretty actively by @zmrow at the moment. It's clear that a larger change is needed to keep up with the rate of feature requests for network configuration. However, netdog + wicked is what's in place at the moment, and for a lot of reasons (support, maintenance) I really want a consistent approach across all platforms. That means some sort of helper to handle network configuration chores on platforms where it isn't a preoccupying concern (vmware, aws), which is the role that netdog fills.

That there's no way to get a shell. The docs indicate there's a way to spawn a container with a shell if you provide the correct configuration. - https://github.com/bottlerocket-os/bottlerocket#admin-container - Perhaps making an easy way to trigger that config by passing a simple debug parameter via the bootconfig or something would help folks trying to do debug work on bottlerocket.

The admin container is only accessible if the network interface needed to access the container registry is configured, so there's an unfortunate bootstrapping problem. This is a longstanding pain point (e.g. #385) where the pragmatic approach ("add a shell") conflicts squarely with the larger architectural requirement ("no shell").

I've mulled over the idea of some sort of "safe mode" partition with a shell that could be rebooted into if the first boot fails, and would be inaccessible thereafter. That wouldn't be a silver bullet but at least the journal would be available, assuming setup for local storage worked.

bcressey commented 2 years ago

"primary" is not a meaningful distinction since kernel networking doesn't share that concept. If the significance of a "primary" interface is that bootup will wait for this device to be configured, then perhaps that behavior should be the parameter: required = true (optional = true) or timeout = <timeout>.

Primary is currently used in two ways:

to determine which interface's DHCP lease should be used to populate /etc/resolv.conf
to determine which interface's reverse DNS lookup should be used to set /proc/sys/kernel/hostname

These are both higher level concerns leaking down into the exposed settings. Hostname in particular gets this treatment because the "correct" hostname is needed for EC2 nodes to authenticate to EKS.

@zmrow it'd be good to take a hard look at whether a "primary" field is actually needed in the environment where the "match" style interface would be used. If hostname + DNS servers are coming in via provisioning then it might be possible to just not have a primary.

One of netplan's features is that it can render various configurations. Imagine how this could be done if a netdog renderer was added to netplan and netplan's syntax (cloud-config's networking v2 syntax) could be baked into the bottlerocket configuration.

Adding netplan itself to Bottlerocket won't happen since it's written in Python. As an alternative along the same lines, it might be worth looking at netplan-types to see if netdog could be taught to parse netplan files.

I think that would end up in the same place from a functionality perspective, if I understand the goal correctly: supporting netplan.yaml in addition to net.toml.

zmrow commented 2 years ago

@zmrow it'd be good to take a hard look at whether a "primary" field is actually needed in the environment where the "match" style interface would be used. If hostname + DNS servers are coming in via provisioning then it might be possible to just not have a primary.

This is fair. At network config generation time, the API server isn't up yet and we haven't read user data so we don't know what has come in via provisioning. We'll mull this over a bit more... :thinking:

As an alternative along the same lines, it might be worth looking at netplan-types to see if netdog could be taught to parse netplan files

I recently found that crate! It's another option we can consider supporting for sure.

zmrow commented 2 years ago

Just for reference - here's the issue where we are evaluating systemd-networkd.

joewilliams commented 2 years ago

I think I am running into problems similar to the OP's. In our case we are migrating from AL2 and using dhclient. We are attempting to use the official Bottlerocket AMI with multiple ENI interfaces (both v4 and v6). Thus far I've only managed to get eth0 working as expected.

For the other interfaces, it seems that, at a minimum, IPs have to be added manually via the admin container. Updating the wicked config to add eth1, etc. and restarting the daemons doesn't seem to help, so I assume it's something deeper and we'll need to build an AMI and net.toml. Relatedly, it seems like interfaces other than eth0 don't have a REACHABLE next hop. Wickedd has some possibly unexpected behavior with how default routes are configured.

zmrow commented 2 years ago

@joewilliams You mentioned the official Bottlerocket AMI so I'm assuming you're using an AWS variant. This issue isn't really applicable to AWS variants since the majority of network config is handled via CNI plugins, not via net.toml. If you continue to see issues with your setup or need some additional help - let's open up a new issue or start a discussion!

joewilliams commented 2 years ago

@zmrow Thanks! Right, the CNI network stuff is working great. We also have host-based networking config in addition to what the CNI is doing.

stmcginnis commented 1 year ago

@zmrow do you know if there are any specific actions that can be tracked with this issue? Or are there other "work in progress" items being tracked elsewhere? Just wondering how to move this issue forward.

heri16 commented 5 months ago

We are using the Bottlerocket ECS variant, and it fails to detect other ENIs that have been attached. Any way to solve this?

bottlerocket-os / bottlerocket

Improve Network Configuration #2469