This repo doesn't provide an end-user install manifest, so I'm assuming you are using it from the Calico docs. I think you should be able to change the deployment as you see fit, even if you're using the helm chart. This is probably happening because we assume the operator may be deployed on a cluster that does not yet have pod networking, so it needs to tolerate NoExecute and NoSchedule in order to be scheduled on such nodes.
Do you have any suggestions on how we could support installing on clusters that start with no networking while still avoiding cordoned nodes?
I'm wondering if we could add an (anti-)affinity to avoid nodes that have been cordoned. As long as it was a preferred (not required) affinity, the pod would still be scheduled even when no nodes matched it.
Yes, I understand, but the behaviour we are seeing is that the scheduler prefers scheduling on cordoned nodes. When we have multiple cordoned nodes, the tigera-operator will move from one cordoned node to another (the cordoned nodes don't have the NoExecute or NoSchedule taints).
I wonder if that's the expected behaviour, and if others are experiencing it too.
the cordoned nodes don't have the NoExecute or NoSchedule taints
That sounds unexpected to me; from what I've found (searching), cordoning a node should mark/taint it Unschedulable. I still do not think this would impact where the tigera-operator is scheduled though, since it tolerates Unschedulable.
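For anyone verifying this on their own cluster, the taint can be checked directly (node name is a placeholder):

kubectl cordon my-node-1
kubectl get node my-node-1 -o jsonpath='{.spec.taints}'
# on recent Kubernetes versions this should include
# {"key":"node.kubernetes.io/unschedulable","effect":"NoSchedule"}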
Tomorrow we are migrating another cluster to Karpenter, I'll attach as much info as I can. Just to clarify:
I was wrong in my last message, I do in fact see the match in taint and toleration. But we are still seeing it hopping from one cordoned node to another until there are no more cordoned nodes in the cluster. Did you ever see this kind of behaviour?
I've seen the same during EKS upgrades, but it doesn't happen on all clusters, so there is some randomness to it. Since we use the AWS VPC CNI and nodes come up with networking already set up, I'm planning to remove these tolerations.
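For reference, a one-off way to drop them from a running install, assuming the default layout (deployment tigera-operator in namespace tigera-operator); note that a later helm upgrade would reapply the chart defaults:

kubectl -n tigera-operator patch deployment tigera-operator \
  --type json -p '[{"op": "remove", "path": "/spec/template/spec/tolerations"}]'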
@tomsucho Any progress on removing these tolerations? Having a hardcoded NoSchedule toleration on the operator breaks a lot of workflows, sadly. At least this should be configurable in the helm chart.
@ayeks in 3.25.1 I can only see the default tolerations on the tigera-operator deployment pod, as follows, and I think those should be fine:
tolerations:
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 300
Installing using helm chart defaults results in:
tolerations:
- effect: NoExecute
  operator: Exists
- effect: NoSchedule
  operator: Exists
resulting in the pod hopping between cordoned nodes. Overriding the chart values to:
tolerations:
- effect: NoSchedule
  operator: Exists
  key: node.kubernetes.io/not-ready
seems to fix it.
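Applying that override with helm looks something like this, assuming the release and chart names from the Calico docs (adjust to your install) and the override saved as values-tolerations.yaml:

helm upgrade calico projectcalico/tigera-operator \
  --namespace tigera-operator \
  -f values-tolerations.yaml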
The helm chart has tolerations that can be changed, and if using the deployment manifest, the tolerations can be edited by those who need to. But I think the default needs to remain tolerating NoExecute and NoSchedule.
To avoid cordoned nodes, I think it would be acceptable to add a preferred affinity for schedulable nodes; the operator could then still be deployed if there were none (like at installation time, when there might not be any schedulable nodes). I'm not sure if it is possible to affinitize on schedulability, has anyone looked into that? I'd be happy to review a PR to the operator chart in the calico repo with a change like that.
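For illustration, a preferred node affinity would look roughly like the sketch below. Note that cordoning only sets a taint (node.kubernetes.io/unschedulable), not a label, so there is nothing built-in for affinity to match on; this assumes some external process labels cordoned nodes, and the cordoned label here is hypothetical:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: cordoned            # hypothetical label, not set by kubectl cordon
          operator: NotIn
          values: ["true"]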
Going to close this as tolerations can be configured in the helm chart, and this PR adds affinity: https://github.com/projectcalico/calico/pull/8095
Expected Behavior
When draining a node that tigera-operator is scheduled on, we expected the tigera-operator pod to be terminated and rescheduled on a node that is not cordoned.
Current Behavior
tigera-operator got scheduled only on cordoned nodes, even when forcefully killing the pod or restarting the deployment.
Steps to Reproduce (for bugs)
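A minimal sequence to observe this, assuming a multi-node cluster with the operator installed via the helm chart (node names are placeholders, and node-b is the node currently running the tigera-operator pod):

kubectl cordon node-a
kubectl drain node-b --ignore-daemonsets --delete-emptydir-data
kubectl -n tigera-operator get pods -o wide --watch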
Context
We saw this issue reproduce twice:
Your Environment
EKS version: 1.22
tigera-operator image version: v1.27.16
Calico image version: v3.23.5
Deploying using the helm chart