Open trevex opened 1 week ago
A potential solution could be to sort the IPs by preferred flags here: https://github.com/siderolabs/talos/blob/e26d0043e022eccf5ea9c9d9b4a57e4bff1f80cc/internal/app/machined/pkg/controllers/network/node_address.go#L154C1-L155C1
However, this would mean addresses in NodeAddress objects are sorted by preference rather than alphabetically.
If this is a valid solution I could draft up a PR.
I agree it might be better for IPv6, but you can also use https://www.talos.dev/v1.8/introduction/prodnotes/#multihoming
I am not sure how this helps here. Both addresses are from the same subnet.
KubeVirt's Passt network binding (which is currently the only fully functional IPv6 option supporting the primary pod network) announces the Pod Subnet (of the hosting cluster) as Prefix via RA and Talos will derive a SLAAC/Temp and follow it up with DHCPv6.
This means the SLAAC-assigned and DHCPv6-assigned IPs are in the same subnet. I don't see a reasonable subnet filter to specify.
The SLAAC address itself is not reachable by the underlying pod network of the KubeVirt hosting cluster. Using it for etcd or kubelet will break connectivity. This is stopping Talos from scaling beyond a single node in an IPv6 KubeVirt environment as an unreachable IP will be advertised.
Is sorting the IPs alphanumerically and by preference based on flags a suitable solution (on top of the existing filtering)? If so, the changes required should be minimal and I might be able to draft up a PR.
It might be worth mentioning that the kubelet will choose the correct IP if no node IP is specified. This is the case with a kubeadm setup based on KubeVirt. From my understanding the Kubelet is using https://github.com/kubernetes/apimachinery/blob/v0.31.2/pkg/util/net/interface.go#L468 under the hood to choose the address.
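To make the sorting proposal above concrete, here is a minimal sketch of ordering addresses by flag preference first and by address second. It is not the Talos implementation: the `candidate` type, `newCandidate`, `rank`, and `sortByPreference` are hypothetical names, and treating `permanent` as most preferred and `mngmtmpaddr` as least preferred is an assumption taken from this report.

```go
package main

import (
	"net/netip"
	"sort"
)

// candidate is a hypothetical stand-in for an address and its flags;
// it is not the Talos AddressStatus type.
type candidate struct {
	Addr  netip.Addr
	Flags []string
}

// newCandidate builds a candidate from a textual address and flag names.
func newCandidate(addr string, flags ...string) candidate {
	return candidate{Addr: netip.MustParseAddr(addr), Flags: flags}
}

// hasFlag reports whether the candidate carries the given flag name.
func hasFlag(c candidate, flag string) bool {
	for _, f := range c.Flags {
		if f == flag {
			return true
		}
	}
	return false
}

// rank returns a lower number for a more preferred address.
// Assumption: mngmtmpaddr addresses go last, permanent ones first.
func rank(c candidate) int {
	switch {
	case hasFlag(c, "mngmtmpaddr"):
		return 2
	case hasFlag(c, "permanent"):
		return 0
	default:
		return 1
	}
}

// sortByPreference orders candidates by flag preference and falls back
// to address order so the result stays deterministic.
func sortByPreference(cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool {
		if ri, rj := rank(cs[i]), rank(cs[j]); ri != rj {
			return ri < rj
		}
		return cs[i].Addr.Less(cs[j].Addr)
	})
}
```

With this ordering, a `permanent` address sorts ahead of a `mngmtmpaddr` one even when it is numerically larger, while addresses with equal rank keep the existing address-based order.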
I understand the issue, but I'd like to make sure we have a proper solution from the ground up for IPv6, so I don't want to rush into fixing this until we have a proper IPv6 testbed we can use to ensure proper operation going forward.
I know it doesn't sound too much fun, but the proper IP can be selected with /128 match if the IP is known beforehand.
Unfortunately the VM's IP is a Pod IP, so for KubeVirt IPv6 (omni-infra-provider-kubevirt) use-cases this is not an option and blocking adoption, but I understand the desire to find the best solution.
I think it does make sense to prefer IPv6 addresses based on flags (not sure if we can omit mngmtmpaddr completely from NodeAddresses?)
> KubeVirt's Passt network binding (which is currently the only fully functional IPv6 option supporting the primary pod network) announces the Pod Subnet (of the hosting cluster) as Prefix via RA and Talos will derive a SLAAC/Temp and follow it up with DHCPv6.
By the way, passt does this because you can't "turn off SLAAC" while sending router advertisements (the M flag is set, but it doesn't tell a node to skip SLAAC). You can disable router advertisements with passt's --no-ra option, but then you'd be missing the route.
But passt also does this because it works with Linux, as addresses with the longest prefixes are preferred as source addresses, see __ipv6_dev_get_saddr() and ipv6_get_saddr_eval() (rule #8) in net/ipv6/addrconf.c for details.
Now, without making this as generic as the Linux kernel, I guess it would be reasonable anyway to pick the longest matching prefix as the preferred address.
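As a rough illustration of that idea, the sketch below picks the candidate with the longest prefix length. It is deliberately simpler than the kernel logic referenced above: it ignores the destination address entirely and only compares prefix lengths. The function names and the example addresses in the usage note are hypothetical.

```go
package main

import (
	"errors"
	"net/netip"
)

// pickLongestPrefix returns the candidate with the longest prefix length.
// On ties the earlier candidate wins, so input order acts as a tie-breaker.
func pickLongestPrefix(cands []netip.Prefix) (netip.Prefix, bool) {
	var best netip.Prefix
	found := false
	for _, c := range cands {
		if !c.IsValid() {
			continue
		}
		if !found || c.Bits() > best.Bits() {
			best, found = c, true
		}
	}
	return best, found
}

// pickFromStrings parses addresses in CIDR notation and returns the
// address part of the candidate with the longest prefix length.
func pickFromStrings(cidrs ...string) (string, error) {
	prefixes := make([]netip.Prefix, 0, len(cidrs))
	for _, s := range cidrs {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return "", err
		}
		prefixes = append(prefixes, p)
	}
	best, ok := pickLongestPrefix(prefixes)
	if !ok {
		return "", errors.New("no candidates")
	}
	return best.Addr().String(), nil
}
```

Given a SLAAC address configured as a /64 and a DHCPv6 address configured as a /128, for example `pickFromStrings("fd01:cafe::dcad:ff:fe00:beaf/64", "fd01:cafe::1234/128")`, the /128 address is selected.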
> I think it does make sense to prefer IPv6 addresses based on flags (not sure if we can omit mngmtmpaddr completely from NodeAddresses?)
Funny enough in our bare-metal Talos setup we do not use DHCPv6 so the SLAAC address is used. A preference based on longest matching prefix sounds like a reasonable approach.
The main reason why passt implements a (minimalistic) DHCPv6 server is that, I've been told, having the same exact address inside and outside the guest is convenient for integration with some container-oriented service meshes that assume "host networking" (hence, addressing).
Yes, and it is also a necessity to run Kubernetes Clusters in KubeVirt either through CAPI or Omni/Talos.
@smira Does Talos have a "feature gate" functionality allowing us to hide the changed behaviour behind a feature gate?
Yes, we do have feature gates, if you could open a proposed PR, we can make a feature gate, and even enable it by default for new clusters on 1.9.
Over the weekend I figured there might be a (dirty) workaround for the KubeVirt use-case (will not help for bare-metal IPv6 use-cases involving DHCPv6):
Spoofing the MAC address of VMs allows us to predict the IP, so we can blacklist it.
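For reference, this is roughly how the address can be predicted, assuming the guest forms its SLAAC address with a modified EUI-64 interface identifier (RFC 4291, Appendix A) rather than a stable-privacy or temporary one. The helper name `slaacEUI64` is hypothetical, and the MAC in the usage note is only inferred from the address that appears later in this comment.

```go
package main

import (
	"errors"
	"net"
	"net/netip"
)

// slaacEUI64 derives the address a host would form from a /64 prefix and
// a 48-bit MAC address using a modified EUI-64 interface identifier.
func slaacEUI64(prefix, mac string) (string, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return "", err
	}
	if !p.Addr().Is6() || p.Bits() != 64 {
		return "", errors.New("expected an IPv6 /64 prefix")
	}
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return "", err
	}
	if len(hw) != 6 {
		return "", errors.New("expected a 48-bit MAC address")
	}

	// Keep the upper 64 bits from the prefix and fill in the lower 64.
	addr := p.Masked().Addr().As16()
	addr[8] = hw[0] ^ 0x02 // flip the universal/local bit
	addr[9] = hw[1]
	addr[10] = hw[2]
	addr[11] = 0xff // 0xfffe is inserted between the two MAC halves
	addr[12] = 0xfe
	addr[13] = hw[3]
	addr[14] = hw[4]
	addr[15] = hw[5]

	return netip.AddrFrom16(addr).String(), nil
}
```

Under these assumptions, `slaacEUI64("fd01:cafe::/64", "de:ad:00:00:be:af")` yields `fd01:cafe::dcad:ff:fe00:beaf`.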
Unfortunately blacklisting does not seem to be supported anymore. The documentation mentions the use of !, but this is not handled in code (anymore).
This will leave the node in a non-functional state:
# cat fdae:41e4:649b:9303:9cd5:e54b:8120:4adb/resources/nodeipconfigs.kubernetes.talos.dev.yaml
metadata:
    namespace: k8s
    type: NodeIPConfigs.kubernetes.talos.dev
    id: kubelet
    version: 1
    owner: k8s.NodeIPConfigController
    phase: running
    created: 2024-11-18T11:29:36Z
    updated: 2024-11-18T11:29:36Z
spec:
    validSubnets:
        - '!fd01:cafe::dcad:ff:fe00:beaf/128'
    excludeSubnets:
        - fd90:cafe::/64
        - fd95:cafe::/108
This might be either outdated documentation or another bug report.
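To illustrate the behaviour I expected from the documented `!` syntax, here is a small sketch that splits a subnet list into included and excluded prefixes and then filters an address. This is not the Talos code path; `splitSubnets` and `isAllowed` are hypothetical helpers, and treating an empty include list as "match everything" is my assumption.

```go
package main

import (
	"net/netip"
	"strings"
)

// splitSubnets separates entries prefixed with "!" (exclusions) from
// plain entries (inclusions) and parses both as CIDR prefixes.
func splitSubnets(entries []string) (include, exclude []netip.Prefix, err error) {
	for _, e := range entries {
		negated := strings.HasPrefix(e, "!")
		p, perr := netip.ParsePrefix(strings.TrimPrefix(e, "!"))
		if perr != nil {
			return nil, nil, perr
		}
		if negated {
			exclude = append(exclude, p)
		} else {
			include = append(include, p)
		}
	}
	return include, exclude, nil
}

// isAllowed reports whether addr passes the filter: it must not match any
// excluded prefix and must match an included one. Assumption: an empty
// include list accepts every address that is not excluded.
func isAllowed(addr string, entries ...string) (bool, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	include, exclude, err := splitSubnets(entries)
	if err != nil {
		return false, err
	}
	for _, p := range exclude {
		if p.Contains(ip) {
			return false, nil
		}
	}
	if len(include) == 0 {
		return true, nil
	}
	for _, p := range include {
		if p.Contains(ip) {
			return true, nil
		}
	}
	return false, nil
}
```

With the entries `fd01:cafe::/64` and `!fd01:cafe::dcad:ff:fe00:beaf/128`, the predicted SLAAC address would be rejected while any other address in the /64 would still pass. In the dump above, the `!` entry instead ends up verbatim in validSubnets.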
I'll start working on a PR to establish a preference for IPv6 IPs ASAP.
Bug Report
Description
When Talos is run in an IPv6 single-stack environment and is assigned multiple IPs by DHCP and RA (although this will most likely apply to dual-stack as well), the kubelet will use the wrong address.
In our case Talos is running in KubeVirt with the Passt network binding plugin and gets an IP via RA followed by a /128 IP from DHCPv6. Only the latter has full bidirectional connectivity.
The preferred /128 address has the flag permanent while the RA address has the flag mngmtmpaddr. The permanent address should be preferred.
Logs
Relevant excerpts from omnictl support:
AddressStatuses
NodeAddresses:
NodeIPs:
Environment
[talosctl version --nodes <problematic nodes>] v1.8.2
[kubectl version --short] v1.30.1