To follow up on #3075, there does indeed seem to be a bug within kubeadm even with a default route present:
[08:33] server123.place10:~# ip route add default via 2001:1700:3500:2::11
[08:39] server123.place10:~# kubeadm token create --print-join-command
route ip+net: no such network interface
To see the stack trace of this error execute with --v=5 or higher
[08:43] server123.place10:~# kubeadm token create --print-join-command --v=5
I0827 08:45:57.725916 13142 token.go:119] [token] validating mixed arguments
I0827 08:45:57.725976 13142 token.go:128] [token] getting Clientsets from kubeconfig file
I0827 08:45:57.726005 13142 cmdutil.go:94] Using kubeconfig file: /etc/kubernetes/admin.conf
I0827 08:45:57.727655 13142 token.go:243] [token] loading configurations
I0827 08:45:57.728166 13142 initconfiguration.go:114] skip CRI socket detection, fill with the default CRI socket unix:///var/run/containerd/containerd.sock
I0827 08:50:02.451174 13142 interface.go:432] Looking for default routes with IPv4 addresses
I0827 08:50:02.451231 13142 interface.go:437] Default route transits interface "*"
route ip+net: no such network interface
[08:50] server123.place10:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.2", GitCommit:"39683505b630ff2121012f3c5b16215a1449d5ed", GitTreeState:"archive", BuildDate:"2024-07-03T09:11:54Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
[08:52] server123.place10:~#
It seems that kubeadm not only needs a default route, but specifically an IPv4 default route, which is not present in IPv6-only networks anyway.
I verified this on a second node where the behaviour is identical with default route present:
[08:46] server122.place10:/usr/local/bin# ip route add default via 2001:1700:3500:2::1; kubeadm token create --print-join-command --v=5; ip route del default via 2001:1700:3500:2::1
I0827 08:47:43.972773 315 token.go:119] [token] validating mixed arguments
I0827 08:47:43.972863 315 token.go:128] [token] getting Clientsets from kubeconfig file
I0827 08:47:43.972893 315 cmdutil.go:94] Using kubeconfig file: /etc/kubernetes/admin.conf
I0827 08:47:43.974538 315 token.go:243] [token] loading configurations
I0827 08:47:43.974970 315 initconfiguration.go:114] skip CRI socket detection, fill with the default CRI socket unix:///var/run/containerd/containerd.sock
I0827 08:52:17.932887 315 interface.go:432] Looking for default routes with IPv4 addresses
I0827 08:52:17.932950 315 interface.go:437] Default route transits interface "*"
route ip+net: no such network interface
[08:52] server122.place10:/usr/local/bin# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.2", GitCommit:"39683505b630ff2121012f3c5b16215a1449d5ed", GitTreeState:"archive", BuildDate:"2024-07-03T09:11:54Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"}
[08:53] server122.place10:/usr/local/bin#
And on both nodes it works with the same workaround as before:
[08:54] server123.place10:~# kubeadm token create --print-join-command --v=5 --config ./kubeadm-init-only.yaml
I0827 08:54:47.068597 15472 token.go:119] [token] validating mixed arguments
I0827 08:54:47.068664 15472 token.go:128] [token] getting Clientsets from kubeconfig file
I0827 08:54:47.068692 15472 cmdutil.go:94] Using kubeconfig file: /etc/kubernetes/admin.conf
I0827 08:54:47.070134 15472 token.go:243] [token] loading configurations
I0827 08:54:47.070154 15472 initconfiguration.go:260] loading configuration from "./kubeadm-init-only.yaml"
I0827 08:54:47.070736 15472 initconfiguration.go:114] skip CRI socket detection, fill with the default CRI socket unix:///var/run/containerd/containerd.sock
I0827 08:54:47.070772 15472 kubelet.go:196] the value of KubeletConfiguration.cgroupDriver is empty; setting it to "systemd"
I0827 08:54:47.072263 15472 version.go:187] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I0827 08:54:47.283390 15472 version.go:256] remote version is much newer: v1.31.0; falling back to: stable-1.30
I0827 08:54:47.283471 15472 version.go:187] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.30.txt
I0827 08:54:47.630326 15472 token.go:252] [token] creating token
kubeadm join [2a0a:e5c0:10:1::123]:6443 --token ..... --discovery-token-ca-cert-hash sha256:....
[08:54] server123.place10:~# cat kubeadm-init-only.yaml
kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
localAPIEndpoint:
  advertiseAddress: 2a0a:e5c0:10:1::123
  bindPort: 6443
Same on the other node:
[08:54] server122.place10:~# kubeadm token create --print-join-command --v=5 --config ./kubeadm-init-only.yaml
I0827 08:54:41.827711 2019 token.go:119] [token] validating mixed arguments
I0827 08:54:41.827783 2019 token.go:128] [token] getting Clientsets from kubeconfig file
I0827 08:54:41.827805 2019 cmdutil.go:94] Using kubeconfig file: /etc/kubernetes/admin.conf
I0827 08:54:41.829328 2019 token.go:243] [token] loading configurations
I0827 08:54:41.829346 2019 initconfiguration.go:260] loading configuration from "./kubeadm-init-only.yaml"
I0827 08:54:41.830228 2019 initconfiguration.go:114] skip CRI socket detection, fill with the default CRI socket unix:///var/run/containerd/containerd.sock
I0827 08:54:41.830268 2019 kubelet.go:196] the value of KubeletConfiguration.cgroupDriver is empty; setting it to "systemd"
I0827 08:54:41.831398 2019 version.go:187] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I0827 08:54:42.307952 2019 version.go:256] remote version is much newer: v1.31.0; falling back to: stable-1.30
I0827 08:54:42.308073 2019 version.go:187] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.30.txt
I0827 08:54:42.641422 2019 token.go:252] [token] creating token
kubeadm join [2a0a:e5c0:10:1::122]:6443 --token ..... --discovery-token-ca-cert-hash sha256:...
[08:54] server122.place10:~# cat kubeadm-init-only.yaml
kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
localAPIEndpoint:
  advertiseAddress: 2a0a:e5c0:10:1::122
  bindPort: 6443
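Since the two files differ only in the advertise address, they could be generated per node. The following is a minimal sketch, not something from the thread: the function name `render_init_config` is made up for illustration, and it emits only the fields shown in the files above. The address is quoted in the output, which is an assumption about style rather than a kubeadm requirement.

```python
import ipaddress


def render_init_config(advertise_address: str, bind_port: int = 6443) -> str:
    """Render a minimal kubeadm InitConfiguration with an explicit address.

    Hypothetical helper: only the fields from the thread are emitted.
    """
    # Raises ValueError for anything that is not a valid IPv4/IPv6 address.
    addr = ipaddress.ip_address(advertise_address)
    return (
        "kind: InitConfiguration\n"
        "apiVersion: kubeadm.k8s.io/v1beta3\n"
        "localAPIEndpoint:\n"
        f'  advertiseAddress: "{addr}"\n'
        f"  bindPort: {bind_port}\n"
    )


if __name__ == "__main__":
    print(render_init_config("2a0a:e5c0:10:1::122"), end="")
```

The rendered text could then be written to a file and passed with --config, as in the commands above.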
However, we are running 30+ k8s clusters without default routes, because these k8s clusters are providing routing services to various infrastructures. Usually these are very small clusters, consisting of 1-4 nodes each, providing BGP, firewalling, NAT64 towards the infrastructure.
I would update the documentation above to reflect that newer versions of kubeadm can work without a default route, but that having a default route is recommended for most cases.
i don't think we want to update the documentation or modify kubeadm, because kubeadm is aligned with the IP detection mechanism of all k8s components. they all use default route.
also, @uablrek @aojea do you remember that kubernetes/kubernetes ticket where we discussed that k8s does need a default route for core features. was it related to Services?
b) add an option to manually specify the IP address
And on both nodes it works with the same workaround as before:
@telmich this is already supported by passing IPs to everything. this is not a workaround. this is the expected way, but topology wise it's not recommended.
this page has a note that is clear about that https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#network-setup
The Kubernetes project recommends against this approach (configuring all component instances with custom IP addresses).
- b) add an option to manually specify the IP address
You can and must do this, if you are setting your own routing rules, it means you are a power user and you are in control of the network, so you must do the corresponding IP planning and assignment
Maybe the question is, how can I pass the IPs to the kubernetes components with kubeadm? if that is the case can you please specify which components or all of them?
@uablrek @aojea do you remember that kubernetes/kubernetes ticket where we discussed that k8s does need a default route for core features. was it related to Services?
Yes. Ref https://github.com/kubernetes/kubernetes/issues/123120
Maybe the question is, how can I pass the IPs to the kubernetes components with kubeadm?
there was this blog post on how to configure all components with kubeadm, but it did not get to a good state to be merged: https://github.com/kubernetes/website/pull/28331 also IMO this is not something we should advise with documentation or blog posts.
also, @uablrek @aojea do you remember that kubernetes/kubernetes ticket where we discussed that k8s does need a default route for core features. was it related to Services?
here it is: https://github.com/kubernetes/kubernetes/issues/123120
(EDIT: was just posted above)
@telmich Check if the bogus default route mentioned in https://github.com/kubernetes/kubernetes/issues/123120#issuecomment-1925700626 can be used as a work-around.
ip ro add default dev lo
i don't think we want to update the documentation or modify kubeadm, because kubeadm is aligned with the IP detection mechanism of all k8s components. they all use default route.
Can you elaborate on this a little bit? I don't understand how any component of k8s actually requires a default route, because de-facto we have many k8s clusters running, healthy without a default route.
We did not do any kind of tuning so far and all k8s components just work, the only exception is so far with kubeadm.
@telmich this is already supported by passing IPs to everything. this is not a workaround. this is the expected way, but topology wise it's not recommended.
I disagree with that as kubeadm does not support something like --address or --bind-address. The only, rather awkward way to make it work without a default route is by passing in above defined kubeadm-init-only.yaml.
- b) add an option to manually specify the IP address
You can and must do this, if you are setting your own routing rules, it means you are a power user and you are in control of the network, so you must do the corresponding IP planning and assignment
There is a difference between routing (where to go) and addressing (which source address to use). They are related, but they are sometimes mixed up.
The routes on all systems work as well as do the addresses. Using ssh, curl, etc. all of these tools work on the machine. With one exception - that is why I created this feature request.
Maybe the question is, how can I pass the IPs to the kubernetes components with kubeadm? if that is the case can you please specify which components or all of them?
Maybe you can help me on this one: I actually don't understand why kubeadm is failing in the first place.
I can curl the kube-apiserver, I can reach all kube components with curl without setting any kind of parameters. I honestly don't understand why kubeadm fails to connect in the first place. From an OS point of view, everything is working normally.
i don't think we want to update the documentation or modify kubeadm, because kubeadm is aligned with the IP detection mechanism of all k8s components. they all use default route.
Can you elaborate on this a little bit? I don't understand how any component of k8s actually requires a default route, because de-facto we have many k8s clusters running, healthy without a default route.
how are you configuring bind addresses for them?
We did not do any kind of tuning so far and all k8s components just work, the only exception is so far with kubeadm.
all k8s components have the same IP detection mechanism. it is summarized in this note:
If two or more default gateways are present on the host, a Kubernetes component will try to use the first one it encounters that has a suitable global unicast IP address. While making this choice, the exact ordering of gateways might vary between different operating systems and kernel versions.
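The selection rule in that note can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Kubernetes code: `pick_address` is a made-up name, the candidate list is assumed to be already ordered by default route, and `is_global_unicast` is modelled on Go's `net.IP.IsGlobalUnicast` as I understand it (which, unlike Python's `is_global`, still accepts private ranges).

```python
import ipaddress
from typing import Iterable, Optional


def is_global_unicast(addr: str) -> bool:
    """Approximation of Go's net.IP.IsGlobalUnicast (an assumption).

    Private ranges such as 10.0.0.0/8 still count as global unicast here.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4 and ip == ipaddress.ip_address("255.255.255.255"):
        return False  # limited broadcast
    return not (
        ip.is_unspecified or ip.is_loopback or ip.is_multicast or ip.is_link_local
    )


def pick_address(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that is a usable global unicast address.

    `candidates` is assumed to hold the addresses of the interfaces behind
    each default route, in the order the routes were encountered.
    """
    for addr in candidates:
        if is_global_unicast(addr):
            return addr
    return None  # nothing suitable: the caller has to fail or fall back
```

With no default route at all, the candidate list is empty and nothing can be picked, which matches the failure mode discussed in this thread.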
Maybe the question is, how can I pass the IPs to the kubernetes components with kubeadm?
there was this blog post on how to configure all components with kubeadm, but it did not get to a good state to be merged: kubernetes/website#28331 also IMO this is not something we should advise with documentation or blog posts.
also, @uablrek @aojea do you remember that kubernetes/kubernetes ticket where we discussed that k8s does need a default route for core features. was it related to Services?
here it is: kubernetes/kubernetes#123120
(EDIT: was just posted above)
Thanks a lot for the information! I've read that issue and while adding the default route seems to fix the particular problem, nothing I find in the ticket actually states that something requires a default route, other than the link to the kubeadm documentation.
From a network and OS perspective I would also claim that a default route is generally speaking unnecessary. What is required is connectivity between the nodes. Even the described kube-proxy issue might just be an implementation issue, which might not even be there with a CNI such as calico that can do bgp peering.
What I really want to say with this is, I don't think a default route is actually necessary for kubeadm nor any of the k8s components. There might currently be dependencies on it in one or the other implementation way, but technically, networking wise, it is not needed.
@telmich this is already supported by passing IPs to everything. this is not a workaround. this is the expected way, but topology wise it's not recommended.
I disagree with that as kubeadm does not support something like --address or --bind-address. The only, rather awkward way to make it work without a default route is by passing in above defined kubeadm-init-only.yaml.
the way to configure them all is by using the kubeadm configuration file. check the linked blog post PR. kubeadm exposes ways to configure flags for given components.
EDIT: or this page: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
the IP detection that kubeadm does is the same as the kube-apiserver. if you pass the advertiseaddress field this detection is skipped as kubeadm thinks you know the address that you want.
i don't think we want to update the documentation or modify kubeadm, because kubeadm is aligned with the IP detection mechanism of all k8s components. they all use default route.
Can you elaborate on this a little bit? I don't understand how any component of k8s actually requires a default route, because de-facto we have many k8s clusters running, healthy without a default route.
how are you configuring bind addresses for them?
Address binding is not depending on routing.
I think what you mean is "how to decide on which address to bind" and the default answer to that is: bind to the ANY address (:: and 0.0.0.0).
That's the default behaviour of virtually any network server.
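As an illustration of that default, here is a minimal Python sketch of a server listening on the ANY address. The helper name `listen_on_any` is made up for this example; it prefers a dual-stack socket on `::` and falls back to `0.0.0.0` when IPv6 is unavailable.

```python
import socket


def listen_on_any(port: int = 0) -> socket.socket:
    """Listen on the wildcard address so the kernel accepts on every interface.

    port=0 lets the kernel choose a free port (handy for a demo).
    """
    if socket.has_dualstack_ipv6():
        try:
            # "::" with dual-stack enabled covers both :: and 0.0.0.0.
            return socket.create_server(
                ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 disabled at runtime; fall back to IPv4 only
    return socket.create_server(("0.0.0.0", port))


if __name__ == "__main__":
    srv = listen_on_any()
    print("listening on", srv.getsockname()[:2])
    srv.close()
```

No routing table lookup is involved in this bind; which local address ends up on a connection is decided per connection when a client connects.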
Maybe also to clarify, there might be 2 issues here, not sure how mixed we talk about them:
In both cases, this is usually best left to the operating system and should not be chosen by the application, unless
Because there are cases when a host has multiple addresses and the software needs to select a specific one, not left to the kernel.
We did not do any kind of tuning so far and all k8s components just work, the only exception is so far with kubeadm.
all k8s components have the same IP detection mechanism. it is summarized in this note:
If two or more default gateways are present on the host, a Kubernetes component will try to use the first one it encounters that has a suitable global unicast IP address. While making this choice, the exact ordering of gateways might vary between different operating systems and kernel versions.
Interesting, that is a bit the opposite case, having multiple default routes.
From what I recall coding in C (maybe this is different in golang?), NOT bind()ing to an IP address, but just using connect() does the expected thing: https://stackoverflow.com/questions/15673846/how-to-give-to-a-client-specific-ip-address-in-c
Is there a limitation / requirement in go that forces kubeadm to bind in the first place?
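The connect()-without-bind() behaviour is easy to observe from user space. The sketch below is in Python rather than C or Go, and `kernel_source_address` is a hypothetical helper name: it connects a UDP socket (which sends no packets) and then asks which source address the kernel selected.

```python
import socket


def kernel_source_address(dest: str, port: int = 9) -> str:
    """Return the source address the kernel would use to reach `dest`.

    Raises OSError (e.g. "Network is unreachable") if no route covers dest.
    """
    family = socket.AF_INET6 if ":" in dest else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as s:
        # connect() on a UDP socket sends nothing; it only fixes the peer,
        # which makes the kernel do a route lookup and pick a source address.
        s.connect((dest, port))
        return s.getsockname()[0]


if __name__ == "__main__":
    print(kernel_source_address("127.0.0.1"))
```

The lookup only needs some route that covers the destination; whether that route is a default route or a more specific one makes no difference to the kernel.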
@telmich this is already supported by passing IPs to everything. this is not a workaround. this is the expected way, but topology wise it's not recommended.
I disagree with that as kubeadm does not support something like --address or --bind-address. The only, rather awkward way to make it work without a default route is by passing in above defined kubeadm-init-only.yaml.
the way to configure them all is by using the kubeadm configuration file. check the linked blog post PR. kubeadm exposes ways to configure flags for given components.
EDIT: or this page: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
the IP detection that kubeadm does is the same as the kube-apiserver. if you pass the advertiseaddress field this detection is skipped as kubeadm thinks you know the address that you want.
I just checked the generated manifest that kubeadm created and it contains:
kube-apiserver --advertise-address=2a0a:e5c0:10:1::122
And the initial configuration passed to kubeadm did in fact contain:
kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
localAPIEndpoint:
  advertiseAddress: 2a0a:e5c0:10:1::122
  bindPort: 6443
... which seems to somewhat explain why passing in the InitConfiguration makes kubeadm work on systems without a default route (?), because it uses the InitConfiguration to set its source address?
Sorry, slightly puzzled on this one.
However, we are running 30+ k8s clusters without default routes
i'm assuming these are non-kubeadm clusters, so you must be configuring components in them somehow. if you are passing explicit IP address to kube-apiservers in such clusters you are bypassing the kube-apiserver IP detection.
if you want to use kubeadm and skip this detection just pass the config with advertiseAddress.
... which seems to somewhat explain why passing in the InitConfiguration makes kubeadm work on systems without a default route (?), because it uses the InitConfiguration to set its source address?
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
--advertise-address string
The IP address on which to advertise the apiserver to members of the cluster. This address must be reachable by the rest of the cluster. If blank, the --bind-address will be used. If --bind-address is unspecified, the host's default interface will be used.
--bind-address string Default: 0.0.0.0
The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces and IP address families will be used.
comment is a bit misleading, but this is where the autodetection as per the kubeadm network docs comes into play.
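Read literally, the two flag descriptions describe a fallback chain, which can be sketched as below. This is one interpretation of the quoted help text, not the kube-apiserver implementation; `resolve_advertise_address` and the `detect_default` callback are hypothetical names, and treating `0.0.0.0` and `::` as "unspecified" is an assumption.

```python
import ipaddress
from typing import Callable, Optional


def resolve_advertise_address(
    advertise: Optional[str],
    bind: Optional[str],
    detect_default: Callable[[], str],
) -> str:
    """Pick the address to advertise, following the quoted flag descriptions.

    detect_default stands in for the default-route based autodetection.
    """
    if advertise:
        # An explicit advertise address wins; no detection is attempted.
        return advertise
    if bind and not ipaddress.ip_address(bind).is_unspecified:
        # A concrete bind address is used as the advertise address.
        return bind
    # Nothing usable was configured: fall back to autodetection.
    return detect_default()
```

Under this reading, setting advertiseAddress in the InitConfiguration short-circuits at the first branch, so the default-route lookup is never reached.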
@telmich there are a lot of things under the hood that are not easy to infer. I'm happy to talk offline in a meeting or slack so it is easier to ask questions and get context.
closing as documentation covers the status quo.
FYI, this is CNI-plugin dependent (so "any" in the header is wrong). Installing without default route on some CNI-plugins:
In all cases a warning is printed:
W0830 05:57:46.291946 279 common.go:199] WARNING: could not obtain a bind address for the API Server: no default routes found in "/proc/net/route" or "/proc/net/ipv6_route"; using: 0.0.0.0
And BTW, the bogus default route (https://github.com/kubernetes/kubeadm/issues/3102#issuecomment-2312174483) doesn't work.
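To check what a lookup in "/proc/net/ipv6_route" would find on a given host, the file can be parsed by hand. The sketch below is illustrative only and is not the Kubernetes code: `ipv6_default_routes` is a made-up name, and skipping entries without a gateway is a design choice of this sketch (it filters out the kernel's unreachable catch-all entry on lo).

```python
import ipaddress
from typing import List, Tuple

ZERO = "0" * 32  # all-zero IPv6 address as printed in /proc/net/ipv6_route


def ipv6_default_routes(table: str) -> List[Tuple[str, str]]:
    """Return (gateway, interface) pairs for IPv6 default routes.

    `table` is the text of /proc/net/ipv6_route. Columns used here:
    0 = destination, 1 = destination prefix length, 4 = next hop, 9 = device.
    """
    routes = []
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # malformed or truncated line
        dest, prefix_len, gateway, dev = fields[0], fields[1], fields[4], fields[9]
        if dest != ZERO or prefix_len != "00":
            continue  # not ::/0
        if gateway == ZERO:
            continue  # no next hop (assumption: not useful for detection)
        routes.append((str(ipaddress.IPv6Address(int(gateway, 16))), dev))
    return routes


# Sample input: one real default route, one specific prefix, one catch-all on lo.
SAMPLE = """\
00000000000000000000000000000000 00 00000000000000000000000000000000 00 20011700350000020000000000000011 00000400 00000001 00000000 00000003 eth0
2a0ae5c0001000010000000000000000 40 00000000000000000000000000000000 00 00000000000000000000000000000000 00000100 00000001 00000000 00000001 eth0
00000000000000000000000000000000 00 00000000000000000000000000000000 00 00000000000000000000000000000000 ffffffff 00000001 00000000 00200200 lo
"""

if __name__ == "__main__":
    print(ipv6_default_routes(SAMPLE))
```

On a real host, the input would be the contents of /proc/net/ipv6_route; an empty result means there is no gatewayed IPv6 default route to detect.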
What keywords did you search in kubeadm issues before filing this one?
I've checked previous tickets, specifically #3075 in which this issue was last discussed.
FEATURE REQUEST
Request
Add support for kubeadm to work without default route.
Background
I am aware of https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#network-setup saying we need a default route and I understand that there might be regressions running k8s without one.
However, we are running 30+ k8s clusters without default routes, because these k8s clusters are providing routing services to various infrastructures. Usually these are very small clusters, consisting of 1-4 nodes each, providing BGP, firewalling, NAT64 towards the infrastructure.
Technical considerations
One of the main reasons kubeadm currently fails is due to the selection of the IP address. From a network perspective, I see two very easy solutions to this:
In my opinion, (a) should be the default and (b) is already supported for many tools such as ping (using -I), ssh (using -B),
Follow up tasks
I would update the documentation above to reflect that newer versions of kubeadm can work without a default route, but that having a default route is recommended for most cases.
Versions
kubeadm version (use kubeadm version): kubeadm version: &version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.2", GitCommit:"39683505b630ff2121012f3c5b16215a1449d5ed", GitTreeState:"archive", BuildDate:"2024-07-03T09:11:54Z", GoVersion:"go1.22.5", Compiler:"gc", Platform:"linux/amd64"} (however seems to apply to all kubeadm versions)
Environment:
kubectl version: any
uname -a: any
What happened?
Trying to print the join command or running kubeadm upgrade apply ... fails on hosts without a default route.
What you expected to happen?
I expect it to work, as the hosts in question have the full internet routing table:
They can reach any host, they just don't have a default route.
How to reproduce it (as minimally and precisely as possible)?
Create a node, remove the default route, keep required routes for pulling images.
Anything else we need to know?
Kubernetes provides a very good framework, including for managing network components. In many cases pods running on routers are actually running on the HostNetwork, which admittedly is not the default case but is a very useful one.
We are running data centers all around the world and have been running routing services inside kubernetes now for almost 2 years. Upgrading kubeadm clusters without a default route is a bit more dangerous and complicated than other systems because of the default route requirement.
So a typical upgrade or joining other nodes flow is at the moment:
So using kubeadm with a default route is possible, but dangerous and more complex, because routes are important and we don't want packets to be delivered to the wrong router (which we need to specify using the default route).
If there is any work needed in regards to selecting the right IP address or reasoning about the logic, I can help there. I am just not familiar with Go or the kubeadm codebase.