equinix-labs / terraform-equinix-metal-eks-anywhere

EKS Anywhere on Equinix Metal (Baremetal)
https://deploy.equinix.com/labs/terraform-equinix-metal-eks-anywhere/
Apache License 2.0

Support LACP bonded port configurations #30

Open displague opened 2 years ago

displague commented 2 years ago

It is currently not possible to define the userdata needed for the Tinkerbell booted nodes to configure an LACP bond.

In this Equinix Metal Terraform project, we currently bring up the device in L2 Unbonded mode to work around this limitation.

L2 Bonded mode is the ideal configuration. https://metal.equinix.com/developers/docs/layer2-networking/layer2-mode/

enkelprifti98 commented 2 years ago

LACP bonding support is needed for production workloads since the current L2 Unbonded deployment model brings risks of downtime during network maintenance events.

displague commented 2 years ago

To support this, we would need a path for Bottlerocket (#20) to define bonding features, such as in the cloud-init network config v2 format: https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html#bonds

Bottlerocket's bare metal provisioning guide includes some pointers on network configuration: https://github.com/bottlerocket-os/bottlerocket/blob/develop/PROVISIONING-METAL.md

Once the correct configuration is known, we would then need a way to specify this configuration in our Hardware (as userdata and/or metadata) or possibly through Workflow templates (making direct changes to network configuration files).

displague commented 2 years ago

Related issue on Bottlerocket: https://github.com/bottlerocket-os/bottlerocket/issues/2369

displague commented 2 years ago

Here's an example of an L2 LACP bonded configuration in Equinix Metal with static IP assignments. https://github.com/equinix-labs/terraform-metal-hybrid-gateway/blob/main/modules/backend/cloud-config.cfg

In a Tinkerbell environment, DHCP would issue the addresses. https://netplan.io/examples#configuring-interface-bonding provides an example of how to configure this within the OS.
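To make the target shape concrete, here is a minimal Python sketch that renders a netplan-style (network config v2) document for a DHCP-addressed LACP bond. The bond name `bond0`, the NIC names `eno1`/`eno2`, and the helper names `bond_config` and `to_yaml` are assumptions for illustration only; real NIC names depend on the hardware plan, and this has not been validated against Tinkerbell or Bottlerocket.

```python
def bond_config(name, members, mode="802.3ad"):
    """Build a network config v2 document describing one bond over `members`."""
    parameters = {"mode": mode, "mii-monitor-interval": 100}
    if mode == "802.3ad":
        # These options only make sense for LACP bonds.
        parameters["lacp-rate"] = "fast"
        parameters["transmit-hash-policy"] = "layer3+4"
    return {
        "network": {
            "version": 2,
            # Member NICs carry no addresses of their own.
            "ethernets": {nic: {"dhcp4": False} for nic in members},
            "bonds": {
                name: {
                    "interfaces": list(members),
                    "dhcp4": True,  # addresses come from DHCP, not static config
                    "parameters": parameters,
                }
            },
        }
    }


def to_yaml(node, indent=0):
    """Render nested dicts, lists of strings, and scalars as simple YAML lines."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}: [{', '.join(value)}]")
        elif isinstance(value, bool):
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


if __name__ == "__main__":
    # Hypothetical NIC names; substitute the names the OS actually assigns.
    print("\n".join(to_yaml(bond_config("bond0", ["eno1", "eno2"]))))
```

Passing `mode="active-backup"` produces the same document without the LACP-specific parameters, which may be useful for comparing the two bonding modes side by side.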

When each worker boots and uses DHCP and iPXE to fetch the Tinkerbell workflow, will the Hook environment also need to run in an LACP bonded configuration (in order to fetch images, for example)?

displague commented 1 year ago

Bonding is now supported in Bottlerocket 1.12.0:

Bonds cannot be composed of hardware addresses today, but this new Bottlerocket feature will allow bonding interfaces where the NIC name is known and consistent (m3.small.x86, see #33).

enkelprifti98 commented 1 year ago

It looks like bonding in Bottlerocket is limited to mode 1 only (Active-backup). We ideally need support for bonding mode 4 (802.3ad / active-active).

https://github.com/bottlerocket-os/bottlerocket/blob/develop/PROVISIONING-METAL.md#supported-interface-settings