Open till opened 7 months ago
This is not a k0s problem; it's standard Calico configuration. Please use:
provider: calico
calico:
  mode: "bird"
  envVars:
    IP_AUTODETECTION_METHOD: "interface=eth1"
    IP6_AUTODETECTION_METHOD: "interface=eth1"
in your k0s config. K0s should not try to guess or "help", since systems are different in unpredictable ways. As you can see, I want eth1 to be used (in my case eth0 is the maintenance network).
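For context, here is a minimal sketch of where that fragment would sit in a complete k0s config file. The surrounding `apiVersion`, `kind`, `metadata`, and `spec.network` nesting are assumptions based on the usual k0s `ClusterConfig` layout, not something stated in this thread, and `eth1` is only correct for the setup described above.

```yaml
# Sketch only: the outer structure is assumed, the calico block is from the comment above.
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    provider: calico
    calico:
      mode: "bird"
      envVars:
        # Pin Calico's node address detection to one interface.
        # Replace eth1 with the interface that carries node-to-node traffic.
        IP_AUTODETECTION_METHOD: "interface=eth1"
        IP6_AUTODETECTION_METHOD: "interface=eth1"
```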
As an example, some programs (e.g. early cri-o) checked the default route and selected that interface. I usually have multiple targets for my default route (ECMP), and that didn't work, of course. So I had to temporarily set a fake default route, and then reset it after the "clever" programs had done their magical "helping". No thanks to that!
Is your feature request related to a problem? Please describe.
We had a weird issue where pods on a new k0s cluster were unable to talk to pods on another node/host. It turned out that the auto detection in Calico had somehow guessed the wrong interface.
So instead of using eth0 like on other clusters, it used eth1, which is a management network and not supposed to be used for node-to-node communication. This meant that the calico.vxlan interface lost all traffic.
We tried tcpdump etc., which wasn't very helpful. I already created an issue in Calico to find out if there's anything one can do to effectively debug/troubleshoot the tunnel, since there doesn't seem to be anything obvious and the calicoctl tool is in a broken state (e.g. regarding the use of docker-cli) and either demands to be executed on nodes directly or works with a $KUBECONFIG.
Describe the solution you would like
I see that the autodetection method is currently empty by default: https://github.com/k0sproject/k0s/blob/1311fb0b73bb3d99202010f802e486aca5b813d4/pkg/apis/k0s/v1beta1/calico.go#L64
From the docs, it seems like Calico will use the first interface found: https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection#autodetection-methods
Why this is (the expected) eth0 on some clusters and eth1 on others is currently unknown to me.
I would propose to use kubernetes-internal-ip or can-reach instead? Maybe some docs on how they can be used would be helpful as well.
Describe alternatives you've considered
Configuring this myself.
Additional context
Could be that a bump in Calico is needed for the kubernetes-internal-ip one, but I am not sure.
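To illustrate the two proposed methods, here is a rough sketch of how they could be set through the same envVars mechanism shown in the comment at the top. This assumes the Calico version bundled with k0s accepts `kubernetes-internal-ip`, which is exactly the open question above. The address in the `can-reach` variant is a made-up placeholder.

```yaml
# Sketch only: fragment of spec.network in the k0s config.
provider: calico
calico:
  envVars:
    # Variant 1: use the node's Kubernetes InternalIP as reported in the Node object.
    IP_AUTODETECTION_METHOD: "kubernetes-internal-ip"
    # Variant 2 (alternative, use instead of variant 1): pick the interface
    # whose route reaches the given destination. 10.0.0.1 is a hypothetical address.
    # IP_AUTODETECTION_METHOD: "can-reach=10.0.0.1"
```

Either variant avoids depending on interface enumeration order, which seems to be what differed between the clusters.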