replicatedhq / troubleshoot

Preflight Checks and Support Bundles Framework for Kubernetes Applications
https://troubleshoot.sh
Apache License 2.0

analyzer: make sure there's a valid interface that kubernetes can use #1536

Open adamancini opened 2 months ago

adamancini commented 2 months ago

Describe the rationale for the suggested feature.

An end user may try to implement an air gap by removing the host's primary interface or by removing routes from the routing table. This may break CNI if no interface remains that flannel can use to build its bridge.

Describe the feature

Detect if there is a valid interface that kubeadm init can use for building CNI.

Describe alternatives you've considered

Something along the lines of what happens during kubeadm init phase preflight, which can generate errors like the ones below when the host has this routing table:

default via 169.254.1.1 dev idrac proto static metric 100
169.254.1.0/24 dev idrac proto kernel scope link src 169.254.1.2 metric 100
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown

This token will expire in 24 hours
cannot use "169.254.1.2" as the bind address for the API Server
To see the stack trace of this error execute with --v=5 or higher
Retry 1/3 exited 1, retrying in 1 seconds...
...
cannot use "169.254.1.2" as the bind address for the API Server
To see the stack trace of this error execute with --v=5 or higher
Retry 2/3 exited 1, retrying in 2 seconds...
...
cannot use "169.254.1.2" as the bind address for the API Server
To see the stack trace of this error execute with --v=5 or higher
Retry 3/3 exited 1, no more retries left.

diamonwiggins commented 2 months ago

Per https://kubernetes.io/docs/concepts/services-networking/service/#custom-endpointslices we should treat interfaces whose addresses fall in the loopback or link-local ranges as not valid for install.

Also, this is relevant for Kubernetes in general, not just for kubeadm-based installs. I've updated the title to that effect.

adamancini commented 2 months ago

@diamonwiggins thanks for tracking that down

chris-sanders commented 2 months ago

If we improve this, we should be sure to update the Embedded Cluster spec when it's available. Ref: https://github.com/replicatedhq/embedded-cluster/pull/579/files

adamancini commented 2 months ago

https://github.com/projectcalico/calico/issues/8481

adamancini commented 2 months ago

https://github.com/kubernetes/kubernetes/issues/123120