rancher / rke2

https://docs.rke2.io/
Apache License 2.0

Public IP used as Internal IP and causing problems even with `node-ip` set and IPs corrected on initialization #6267

Closed. Ken-Michalak closed this issue 2 months ago.

Ken-Michalak commented 2 months ago

Environmental Info:

RKE2 Version: v1.28.10+rke2r1 (b0d0d687d98f4fa015e7b30aaf2807b50edcc5d7) go version go1.21.9 X:boringcrypto

Node(s) CPU architecture, OS, and Version: Ubuntu 24.04 LTS (GNU/Linux 6.8.0-31-generic x86_64) Linux staging-server-1 6.8.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Sat Apr 20 00:40:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 1 server, 1 agent. Dual-stack Cilium, in Hetzner Cloud, with a public IPv4, public IPv6, and private IPv4.

Describe the bug: On creation, the public IPv4 is used as the internal IP, and there doesn't seem to be any combination of configuration that completely fixes that behavior. Even with node-ip and tls-san set, and the IPs corrected by the cloud controller manager on initialization, some configuration remains tied to that external IP instead of the internal one, causing communication issues.

Steps To Reproduce: It seems to be caused by having the public IP on the first network interface, with the private IP on another interface, but I'm not sure how you would set that up outside of creating a server in Hetzner. I'm working on a terraform module here: https://gitlab.com/5stones/tf-modules/-/tree/49aa0d1f7622e88a181d757d2e502b91198146b1/hetzner/rke2 This is the rke2 config that gets things close to working (node-ip is added on boot): https://gitlab.com/5stones/tf-modules/-/blob/49aa0d1f7622e88a181d757d2e502b91198146b1/hetzner/rke2/server.yaml#L27
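
For reference, the rough shape of that config is below. This is a condensed sketch with placeholder addresses, not the exact contents of the linked server.yaml:

node-ip: 10.0.1.1      # private IPv4 on the Hetzner network, filled in on boot (placeholder)
tls-san:
  - 10.0.1.1           # internal IP, so the apiserver certificate covers it (placeholder)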

Expected behavior: I would expect the scope of the addresses (as shown by something like ip addr show) to be taken into account, and the internal and external IPs to be set correctly. If that's not possible, it should at least be able to determine private vs public based on the IP itself, and not set an address as internal unless it's in the standard private IPv4 ranges or it's set manually.

Actual behavior: The public IP is used as the internal IP, and the real internal IP is not included. The public IPv6 is also not included in the IPs reported by kubectl describe node, even though dual-stack is enabled. Once the cloud controller manager initializes the node, it has the three IPs correctly assigned, but the api cert (or possibly etcd) doesn't include the internal IP, so calls to the api server fail.

I attempted to set node-ip, but kubectl describe node still only shows the external IP as an internal IP, and the internal IP is still excluded from the api certificate. That prevented me from even installing the cloud controller manager with a Helm CRD.

The closest I've gotten to a working setup is by setting node-ip and adding the internal IP to the tls-san array, but even after initialization, I get this error in some situations:

Error from server: no preferred addresses found; known addresses: [{Hostname staging-server-1} {ExternalIP 5.161...} {ExternalIP 2a01:...::1}]

Additional context / logs: The public IPv4 and IPv6 are on eth0, and the private IPv4 is on eth7s0.

There are a number of issues similar to this, but they seem to keep getting closed with workarounds instead of being fixed: https://github.com/rancher/rke2/issues/591 https://github.com/rancher/rke2/issues/4759 https://github.com/rancher/dashboard/issues/5603

brandond commented 2 months ago

Are you using the hetzner cloud controller, or the embedded one?

If you're using the embedded one, and don't manually configure internal and external IPs, then it is expected that the external IP would not be set, and the internal IP would just be whatever address the kubelet picks - which is just the one associated with the default route. This is default kubelet behavior, not anything specific to RKE2. The whole point of cloud controllers is to be able to talk to the cloud infrastructure provider's APIs to figure out what the internal and external IPs are for the node.
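
For illustration, manually setting both addresses in the RKE2 config would look roughly like this (a sketch with placeholder addresses):

node-ip: 10.0.1.1               # registered as the node's InternalIP (placeholder)
node-external-ip: 203.0.113.10  # registered as the node's ExternalIP (placeholder)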

If you're deploying the Hetzner cloud controller it should handle setting the addresses properly, but you haven't provided enough info to actually figure out what the problem is, just a few snippets of errors without any context.

Ken-Michalak commented 1 month ago

> Are you using the hetzner cloud controller, or the embedded one?
>
> If you're using the embedded one, and don't manually configure internal and external IPs, then it is expected that the external IP would not be set, and the internal IP would just be whatever address the kubelet picks - which is just the one associated with the default route. This is default kubelet behavior, not anything specific to RKE2. The whole point of cloud controllers is to be able to talk to the cloud infrastructure provider's APIs to figure out what the internal and external IPs are for the node.
>
> If you're deploying the Hetzner cloud controller it should handle setting the addresses properly, but you haven't provided enough info to actually figure out what the problem is, just a few snippets of errors without any context.

It does actually matter even if the CCM corrects it. That's the whole point here. If I do not set node-ip or tls-san at all, the automatically chosen IP is used in the certificate, and the cluster does not function after the CCM corrects the IPs. Agents can't join, metrics don't work, Helm CRDs won't install, etc.

I don't think this is an upstream issue either, because upstream, all of those components are separate. The kubelet IP changes fine, and on RKE2 agents I actually don't have to worry about setting node-ip, because nothing else breaks from it being incorrect before the node is initialized. With RKE2 servers though, components like etcd and the api server are started by the kubelet, so the internal IP has to be correct from the start.

No matter how you look at it, there's a problem here. If scripting something to set node-ip is really the only possible solution, you shouldn't have to also include that IP in tls-san; it shouldn't just be using the IP from the default route there. And there's still some other configuration I'm missing or can't set, otherwise I would be able to run kubectl exec without getting that "no preferred addresses found" error. Maybe this just needs documentation, like "If you have a dual-stack cluster and the IP is wrong, you have to set [ipv4, ipv6] as the node-ip, set tls-san to ...", but there's no documentation like that, and I cannot get the cluster into a functional state.

The basic problem is that the automatically set IP is inaccurate. Even if it's caused by upstream behavior, that doesn't mean RKE2 can't override it, or that the issue is fixed, when it's very clearly just wrong.

I don't know what other context you need, but you can ask for it. All of my configuration is linked above. Snippets of errors mean that there are errors, right?

brandond commented 1 month ago

RKE2 has no way of knowing in advance what IPs Hetzner is going to set for the node. The kubelet certificates need to be generated before the kubelet can start. Once the kubelet has certs and starts, the Node resource is created, and the Hetzner CCM alters the node IPs. We do not have any way to regenerate the kubelet's serving cert on the fly to add the new IPs.

> Snippets of errors mean that there are errors, right?

Yes, but it's not helpful to just say that there are errors. I don't have access to Hetzner, so I can't test any of this myself; you're going to need to provide more info.

> even after initialization, I get this error in some situations: Error from server: no preferred addresses found; known addresses: [{Hostname staging-server-1} {ExternalIP 5.161...} {ExternalIP 2a01:...::1}]

In what specific situations do you get that error? What log file or command output is it in? What are you doing when you encounter this error?

If the problem is just related to the IPs in the kubelet server certificate, you could try just setting this in your rke2 config - assuming your hostnames resolve properly:

kube-apiserver-arg:
  - --kubelet-preferred-address-types=Hostname,InternalDNS,ExternalDNS

You might need to also add a HelmChartConfig to pass the same value into the metrics-server args

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-metrics-server
  namespace: kube-system
spec:
  valuesContent: |-
    args:
      - --kubelet-preferred-address-types=Hostname,InternalDNS,ExternalDNS

Ken-Michalak commented 1 month ago

I think I got the cluster working. Thanks for the preferred address part. That pointed me in the right direction, but seriously, this shouldn't be this difficult.

All in all, in order to fix that default, you need to make sure that /etc/hosts maps the hostname to the internal IP, and then add that internal IP to the server's node-ip, tls-san, and advertise-address. Agents also need node-ip.

Here are the details:

The only thing that gets the internal IP set correctly from the start is overriding the 127.0.1.1 {fqdn} {hostname} line in /etc/hosts to use the internal IP instead. (node-ip does something, but it doesn't actually fix the node's IP.)

That doesn't fix the apiserver though. I still need to add the IP to tls-san and advertise-address.

For kubelet-preferred-address-types, RKE2 seems to be overriding the usual k8s default with InternalIP,ExternalIP,Hostname. So I changed it back to Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP; I'm not sure whether that's necessary or not.
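
Expressed in the server config, that change back to the upstream order looks roughly like this:

kube-apiserver-arg:
  - --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP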

With the networking option enabled in hccm, and those changes, most things were working, except the metrics server. The logs show TLS errors on both internal IPs: E0628 18:27:40.638087 1 scraper.go:149] "Failed to scrape node" err="Get \"https://10.188.1.1:10250/metrics/resource\": tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, 5.161..., 2a01:...::1, not 10.188.1.1" node="staging-agent-1"

I added node-ip back in, for both the server and agent, and metrics scraping does go through. So apparently node-ip just changes the kubelet certificate. It doesn't seem to help anything else, but that appears to be the last missing piece.
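
Putting those pieces together, the working combination looks roughly like this (a sketch of two separate config files with placeholder addresses; the /etc/hosts change and the kube-apiserver-arg shown above are still part of it):

# server /etc/rancher/rke2/config.yaml (sketch, placeholder addresses)
node-ip: 10.188.1.2            # internal IPv4; also ends up in the kubelet serving certificate
advertise-address: 10.188.1.2  # apiserver advertises the internal IP
tls-san:
  - 10.188.1.2                 # internal IP included in the apiserver certificate

# agent /etc/rancher/rke2/config.yaml (sketch, placeholder address)
node-ip: 10.188.1.1            # internal IPv4 of the agent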

For your comments:

> even after initialization, I get this error in some situations: Error from server: no preferred addresses found; known addresses: [{Hostname staging-server-1} {ExternalIP 5.161...} {ExternalIP 2a01:...::1}]
>
> In what specific situations do you get that error? What log file or command output is it in? What are you doing when you encounter this error?

It's from calls to the api server, but I didn't see any rhyme or reason for which ones worked and which didn't. I was hoping someone would recognize the error. It was in the Helm CRD installs and metrics logs, as well as some, but not all, kubectl commands.

> RKE2 has no way of knowing in advance what IPs Hetzner is going to set for the node.

I understand that, but I also can't imagine any situation where you would want it to use an external IP as the one and only internal IP in all of the components, just because it's listed first or it's on the default route. (Isn't a default route what goes to the internet, as opposed to a range of internal IPs?) Maybe this is an oversight in k8s overall, but I feel like it's still just a bad assumption, especially when there are standards for private address ranges. hostname -I shows 3 addresses on boot. The first IPv4 and the IPv6 are in the cert, but not the other IPv4. And even though it's dual-stack, the IPv6 isn't even included on the node until initialization. It doesn't make a lot of sense.

If the CCM is responsible for setting the IPs, why does anything need to know in advance what the IP is going to be? Why do I need to duplicate that logic in a script, if the CCM is going to set it later? If changing the IP during initialization breaks things, then maybe the CCM shouldn't actually be changing IPs at that point. Maybe this isn't on the RKE2 side. I don't know, but it's still broken.

At the very least, I'd expect to be able to set a single parameter in the RKE2 config and have it know everything that needs to be overridden to function. Ideally, the parameter would be a list of CIDRs for internal addresses, like what was brought up here: https://github.com/rancher/rke2/issues/591

brandond commented 1 month ago

> I understand that, but I also can't imagine any situation where you would want it to use an external IP as the one and only internal IP in all of the components, just because it's listed first or it's on the default route.

Hetzner's network layout is somewhat unusual amongst cloud providers. The major cloud providers that participate in Kubernetes development almost universally set up hosts with a private IP that is bound to an interface on the node and, if an external IP is allocated, configure a 1:1 NAT mapping between the external and internal IP. This is how the kubelet and the Kubernetes cloud provider framework expect internal and external IPs to work - they do not expect there to be actual public and private interfaces on the node itself. That's why the kubelet defaults to picking the internal IP by looking at the interface with the default route - it expects other secondary interfaces to be on stub networks that don't go anywhere and would not be mapped to the external IP.

If you've used AWS or GCP or Azure this should all sound pretty familiar.