Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Calico Kube Controller Deployment Not Scheduled on System Pools #2296

Open JasonWhall opened 3 years ago

JasonWhall commented 3 years ago

What happened: Since the update to Calico 3.18.x, new deployments have been added to the cluster for Calico's use. Most of these use an affinity rule to schedule their pods on a "System" node pool. The calico-kube-controllers deployment does not have this affinity rule applied.

What you expected to happen: I would expect the calico-kube-controllers deployment to use an affinity rule similar to the other deployments, so that it prefers scheduling on a system node pool. Example below from the calico-typha deployment:

      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: kubernetes.azure.com/mode
                operator: In
                values:
                - system
            weight: 100
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: type
                operator: NotIn
                values:
                - virtual-node
              - key: kubernetes.azure.com/cluster
                operator: Exists

How to reproduce it (as minimally and precisely as possible): Deploy an AKS 1.20.5 cluster and check the calico-kube-controllers deployment in the calico-system namespace.
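
The check described above can be scripted. The following is only a minimal sketch, assuming `kubectl` is on the PATH and already pointed at the affected cluster. The helper names `prefers_system_pool` and `fetch_deployment` are hypothetical and not part of any AKS or Calico tooling. It reads a deployment as JSON and reports whether the pod template carries the `kubernetes.azure.com/mode In [system]` preference shown in the calico-typha example.

```python
import json
import subprocess

# Node label key taken from the calico-typha affinity example in this issue.
SYSTEM_MODE_KEY = "kubernetes.azure.com/mode"


def prefers_system_pool(deployment: dict) -> bool:
    """Return True if the pod template prefers nodes labelled mode=system."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    node_affinity = (pod_spec.get("affinity") or {}).get("nodeAffinity") or {}
    preferred = node_affinity.get(
        "preferredDuringSchedulingIgnoredDuringExecution") or []
    for term in preferred:
        for expr in (term.get("preference") or {}).get("matchExpressions") or []:
            if (expr.get("key") == SYSTEM_MODE_KEY
                    and expr.get("operator") == "In"
                    and "system" in (expr.get("values") or [])):
                return True
    return False


def fetch_deployment(name: str, namespace: str) -> dict:
    """Hypothetical helper: load a deployment via `kubectl get ... -o json`."""
    result = subprocess.run(
        ["kubectl", "get", "deployment", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

On a cluster showing this bug, `prefers_system_pool(fetch_deployment("calico-kube-controllers", "calico-system"))` would be expected to return `False`, while the same check against a deployment carrying the affinity above would return `True`.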

Anything else we need to know?: As far as I can tell, this can potentially stop "User" node pools from auto-scaling down to 0 nodes, as the deployment has no preference to be rescheduled elsewhere.

Environment:

ghost commented 3 years ago

Hi JasonWhall, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:
1) If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
2) Please abide by the AKS repo Guidelines and Code of Conduct.
3) If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
4) Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
5) Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
6) If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

ghost commented 3 years ago

Triage required from @Azure/aks-pm

ghost commented 3 years ago

Action required from @Azure/aks-pm

ghost commented 3 years ago

Issue needing attention of @Azure/aks-leads

qpetraroia commented 3 years ago

Hi @JasonWhall,

Thanks for this, I just reproduced it and tagged it as a bug.

Thanks!

ath88 commented 1 month ago

This is still an issue - when will it be fixed?