tigera / operator

Kubernetes operator for installing Calico and Calico Enterprise
Apache License 2.0
187 stars 141 forks

Fixed CNI bin dir prevents nodes from working on Kubernetes deployment #1301

Open Omar007 opened 3 years ago

Omar007 commented 3 years ago

It's not possible to configure the path to the CNI bin directory for the Calico deployment. Calico puts the files in /opt/cni/bin for Kubernetes deployments.
If the cluster does not use this path, the files end up in the wrong location and as a result the nodes never become ready.

Expected Behavior

The CNI bin directory can be specified on the operator deployment or Installation spec.

Current Behavior

The CNI bin directory is always /opt/cni/bin. If a cluster uses any other path, the deployment fails.

Possible Solution

Proper

Update the operator to expose configuration options for setting the CNI bin directory path (exposing the config directory as well wouldn't be a bad idea either, tbh).

Workaround

a. Manually copy over the files to the correct location on each node in the cluster.
b. Patch the Calico Node DaemonSet after the operator has deployed it to change the hostPath.
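
For illustration, workaround (b) amounts to repointing one hostPath volume in the DaemonSet spec. Below is a minimal Python sketch of that change on a trimmed manifest held as a dict. The volume names `cni-bin-dir` and `cni-net-dir` and the helper `repoint_host_path` are assumptions made for this example, not taken from the issue; check the deployed DaemonSet for the real names.

```python
import copy


def repoint_host_path(daemonset, volume_name, new_path):
    """Return a copy of a DaemonSet manifest with one hostPath volume repointed."""
    patched = copy.deepcopy(daemonset)
    volumes = patched["spec"]["template"]["spec"]["volumes"]
    for volume in volumes:
        if volume.get("name") == volume_name and "hostPath" in volume:
            volume["hostPath"]["path"] = new_path
            return patched
    raise KeyError(f"no hostPath volume named {volume_name!r}")


# Trimmed, illustrative manifest. The volume names are assumptions.
calico_node = {
    "kind": "DaemonSet",
    "metadata": {"name": "calico-node"},
    "spec": {"template": {"spec": {"volumes": [
        {"name": "cni-bin-dir", "hostPath": {"path": "/opt/cni/bin"}},
        {"name": "cni-net-dir", "hostPath": {"path": "/etc/cni/net.d"}},
    ]}}},
}

# Point the bin dir volume at the path this cluster's kubelet searches.
patched = repoint_host_path(calico_node, "cni-bin-dir", "/usr/lib/cni")
```

Because the operator manages this DaemonSet, it may overwrite a manual change like this when it next reconciles, which is why this is only a workaround.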

Steps to Reproduce (for bugs)

  1. Have a cluster where Kubelet and CRI-O use a CNI bin dir other than /opt/cni/bin
  2. Deploy Calico on said Kubernetes cluster
  3. See the Calico Node instances place 3 files in /opt/cni/bin on the nodes
  4. See Kubelet CNI network configuration failure error and nodes never becoming ready as the required binaries aren't placed where they are needed.

Your Environment

tmjd commented 3 years ago

Could you provide details on your configuration and the directories needed? It would be best if this was a detectable Provider and the operator could automatically detect the provider and use the appropriate bin dir.

Omar007 commented 3 years ago

It's a basic kubeadm cluster installation using the provided distro packages. These are configured/compiled to use /usr/lib/cni as the standard directory instead of /opt/cni/bin.

As far as I'm aware, Kubernetes does not expose this information. Additionally, it's a flag/config option you supply to the Kubelet as well as to CRI-O, so it could even differ per node. In most cases, though, I don't think any operator makes it customizable to that degree; it is assumed to at least be the same across nodes.
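
Since the path is only known from each node's own Kubelet arguments, one way to find it on a given node is to read it out of the argument string. The following is a rough Python sketch, assuming the arguments are available as a single shell-style string and that the flag is `--cni-bin-dir`; the helper name `cni_bin_dir_from_args` is hypothetical, and the fallback simply mirrors the /opt/cni/bin path described above.

```python
import shlex

# Fallback taken from the path this issue says Calico uses.
DEFAULT_CNI_BIN_DIR = "/opt/cni/bin"


def cni_bin_dir_from_args(args, default=DEFAULT_CNI_BIN_DIR):
    """Extract the --cni-bin-dir value from a kubelet-style argument string."""
    tokens = shlex.split(args)
    for i, token in enumerate(tokens):
        # Form 1: --cni-bin-dir=/some/path
        if token.startswith("--cni-bin-dir="):
            return token.split("=", 1)[1]
        # Form 2: --cni-bin-dir /some/path
        if token == "--cni-bin-dir" and i + 1 < len(tokens):
            return tokens[i + 1]
    return default
```

For example, `cni_bin_dir_from_args("--cni-bin-dir=/usr/lib/cni")` returns `/usr/lib/cni`, while a string without the flag falls back to the default.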

tmjd commented 3 years ago

This seems like something we should expose. We'll want to default it based on the current configuration that we use to set that field. I'm not sure when we will get to this, so if someone wants to work on it, I would be happy to review a PR.

ananace commented 2 years ago

As a note, I'd like to see this type of configurability too, though in my case being able to set the CNI network config directory is more important than the binary one - as I've got a few bare-metal clusters with multus on them, and therefore need the Calico CNI config to end up in /etc/cni/multus/net.d.

lou-lan commented 2 years ago

Note:

uname -r
5.15.32-1-MANJARO

uname -a
Linux node-calico-1 5.15.32-1-MANJARO #1 SMP PREEMPT Mon Mar 28 09:16:36 UTC 2022 x86_64 GNU/Linux

cat /etc/kubernetes/kubelet.env

# Kubernetes kubelet arguments
#
# The KUBELET_ARGS environment variable is used to provide flags and options to
# kubelet when running kubelet.service.
# See `man 1 kubelet` or `kubelet --help` for further information.
#
# NOTE: When using kubeadm to bootstrap a cluster KUBELET_ARGS will be appended
# to the kubeadm specific environment variables.
KUBELET_ARGS=--cni-bin-dir=/usr/lib/cni

kubelet logs

May 12 13:49:09 node-calico-1 kubelet[36907]: I0512 13:49:09.812337   36907 cni.go:205] "Error validating CNI config list" configList="{\n  \"name\": \"k8s-pod-network\",\n  \"cniVersion\": \"0.3.1\",\n  \"plugins\": [\n    {\n      \"type\": \"calico\",\n      \"datastore_type\": \"kubernetes\",\n      \"mtu\": 0,\n      \"nodename_file_optional\": false,\n      \"log_level\": \"Info\",\n      \"log_file_path\": \"/var/log/calico/cni/cni.log\",\n      \"ipam\": { \"type\": \"calico-ipam\", \"assign_ipv4\" : \"true\", \"assign_ipv6\" : \"false\"},\n      \"container_settings\": {\n          \"allow_ip_forwarding\": false\n      },\n      \"policy\": {\n          \"type\": \"k8s\"\n      },\n      \"kubernetes\": {\n          \"k8s_api_root\":\"https://10.99.0.1:443\",\n          \"kubeconfig\": \"/etc/cni/net.d/calico-kubeconfig\"\n      }\n    },\n    {\n      \"type\": \"bandwidth\",\n      \"capabilities\": {\"bandwidth\": true}\n    },\n    {\"type\": \"portmap\", \"snat\": true, \"capabilities\": {\"portMappings\": true}}\n  ]\n}" err="[failed to find plugin \"calico\" in path [/usr/lib/cni]]"
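
The error in the log comes from a lookup of each plugin `type` named in the config list inside the configured bin directory. The Python sketch below reproduces that kind of check by hand, assuming a trimmed copy of the config list shown in the log; the helper names `plugin_types` and `missing_plugins` are hypothetical and this is not the Kubelet's own implementation.

```python
import json
import os

# Trimmed version of the config list from the log above.
SAMPLE_CONFLIST = """
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {"type": "calico", "ipam": {"type": "calico-ipam"}},
    {"type": "bandwidth", "capabilities": {"bandwidth": true}},
    {"type": "portmap", "snat": true, "capabilities": {"portMappings": true}}
  ]
}
"""


def plugin_types(conflist_text):
    """List the plugin binaries a CNI config list refers to, in order, without duplicates."""
    conf = json.loads(conflist_text)
    names = []
    for plugin in conf.get("plugins", []):
        # Each plugin entry names a binary; its IPAM section may name another.
        for name in (plugin.get("type"), plugin.get("ipam", {}).get("type")):
            if name and name not in names:
                names.append(name)
    return names


def missing_plugins(conflist_text, search_dirs):
    """Return the plugin names with no executable file in any of search_dirs."""
    missing = []
    for name in plugin_types(conflist_text):
        candidates = (os.path.join(d, name) for d in search_dirs)
        if not any(os.path.isfile(c) and os.access(c, os.X_OK) for c in candidates):
            missing.append(name)
    return missing
```

On an affected node, calling `missing_plugins(SAMPLE_CONFLIST, ["/usr/lib/cni"])` would be expected to include `calico`, matching the `failed to find plugin` message, while the same call with `["/opt/cni/bin"]` shows which binaries Calico did install there.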

kfox1111 commented 1 month ago

This also prevents using the operator on OSes where /opt is part of a read-only filesystem.