oracle / cluster-api-provider-oci

Kubernetes Cluster API Provider for Oracle Cloud Infrastructure
https://oracle.github.io/cluster-api-provider-oci/
Apache License 2.0

Node taints preventing pods from getting scheduled #373

Closed mouad-eh closed 1 month ago

mouad-eh commented 1 month ago

What happened: I created a cluster using the vanilla template (cluster-template.yaml). The cluster was created successfully; however, every pod I create stays in a Pending state forever.

What you expected to happen: I expect pods to be scheduled and ready when created.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?: The pods stay pending because they never get scheduled, due to the taints present on the nodes. For the control plane node, the following taints are present:

node-role.kubernetes.io/control-plane:NoSchedule
node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule

For the worker node, the taints are as follows:

node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule

I guess the node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule taint should be removed from the nodes once they reach a Ready state.
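To illustrate why a pod without tolerations stays pending on these nodes, here is a minimal Python sketch of taint/toleration matching. It is a simplified model, not the scheduler's actual code; the class and function names (Taint, Toleration, tolerates, blocking_taints) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: Optional[str] = None     # None (with operator "Exists") matches every key
    operator: str = "Equal"       # "Equal" or "Exists"
    value: str = ""
    effect: Optional[str] = None  # None matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint (simplified model)."""
    if tol.effect is not None and tol.effect != taint.effect:
        return False
    if tol.key is None:
        # An empty key is only meaningful together with operator "Exists".
        return tol.operator == "Exists"
    if tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.value == taint.value


def blocking_taints(taints: List[Taint], tolerations: List[Toleration]) -> List[Taint]:
    """Return the taints that keep a pod with these tolerations off the node."""
    return [
        t for t in taints
        if t.effect in ("NoSchedule", "NoExecute")
        and not any(tolerates(tol, t) for tol in tolerations)
    ]


# Taints as reported in this issue.
UNINITIALIZED = Taint("node.cloudprovider.kubernetes.io/uninitialized", "NoSchedule", "true")
CONTROL_PLANE = Taint("node-role.kubernetes.io/control-plane", "NoSchedule")

control_plane_node = [CONTROL_PLANE, UNINITIALIZED]
worker_node = [UNINITIALIZED]

if __name__ == "__main__":
    # A pod with no tolerations is blocked on both nodes.
    print(blocking_taints(worker_node, []))
    print(blocking_taints(control_plane_node, []))
```

In this model, an ordinary pod can land on the worker only once the uninitialized taint is gone from the node; the control plane node would still be blocked by its own control-plane taint.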

Environment:

shyamradhakrishnan commented 1 month ago

Please install a CNI and the cloud provider as described in the docs: https://oracle.github.io/cluster-api-provider-oci/gs/create-workload-cluster.html#install-a-cni-provider and https://oracle.github.io/cluster-api-provider-oci/gs/create-workload-cluster.html#install-oci-cloud-controller-manager-and-csi-in-a-self-provisioned-cluster

mouad-eh commented 1 month ago

Installing CNI using the link provided in the documentation gave me the following error:

error: resource mapping not found for name: "calico-kube-controllers" namespace: "kube-system" from "https://docs.projectcalico.org/v3.21/manifests/calico.yaml": no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"
ensure CRDs are installed first

I think I am missing some step where I need to install some CRDs but couldn't find that in the documentation.

shyamradhakrishnan commented 1 month ago

You may need to use a newer version of Calico that matches your Kubernetes version.

mouad-eh commented 1 month ago

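Yes, using the latest version of Calico solved the issue I was having. It seems that the policy/v1beta1 version of PodDisruptionBudget is no longer served as of Kubernetes v1.25 (it was deprecated earlier in favor of policy/v1), which is why the older manifest failed.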
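For context on the "no matches for kind PodDisruptionBudget in version policy/v1beta1" error: policy/v1 PodDisruptionBudget has been available since Kubernetes v1.21, and policy/v1beta1 stopped being served in v1.25. The sketch below encodes just those two cutoffs; the helper names are hypothetical, and it makes no attempt to model releases older than the ones where policy/v1beta1 already existed.

```python
from typing import List, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Parse 'v1.25.0' or '1.25' into (1, 25)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def served_pdb_api_versions(kubernetes_version: str) -> List[str]:
    """Return the PodDisruptionBudget API versions served by a given release.

    Only two cutoffs are modelled (an assumption of this sketch):
    policy/v1 is available from v1.21, policy/v1beta1 is gone from v1.25.
    """
    v = parse_major_minor(kubernetes_version)
    served = []
    if v < (1, 25):
        served.append("policy/v1beta1")
    if v >= (1, 21):
        served.append("policy/v1")
    return served


if __name__ == "__main__":
    for version in ("v1.20.0", "v1.24.3", "v1.25.0"):
        print(version, served_pdb_api_versions(version))
```

A manifest that still declares its PodDisruptionBudget under policy/v1beta1, like the Calico v3.21 one referenced above, therefore cannot be applied to a v1.25 or newer cluster without switching to a manifest that uses policy/v1.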

As for the taints, I was missing the OCI Cloud Controller Manager. Once I installed it, the nodes were initialized and the uninitialized taint was removed (I have only worked with on-prem k8s, so this CCM thing was new to me).

Thanks for your help.