Open darklight147 opened 4 months ago
Looking at the cluster ID you shared, I see logs like this: creating instance, insufficient capacity, regional on-demand vCPU quota limit for subscription has been reached. To scale beyond this limit, please review the quota increase process here: https://learn.microsoft.com/en-us/azure/quotas/regional-quota-requests
If you run kubectl get events | grep karp, do you see any events like this?
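Plain kubectl get events only looks at the current namespace, so the grep can come back empty even when relevant events exist elsewhere. A minimal sketch of a wider search, assuming the provisioning events are attached to NodeClaim objects (the helper name nodeclaim_events is only illustrative):

```shell
# Hypothetical helper: list events whose involved object is a NodeClaim,
# across all namespaces, oldest first.
nodeclaim_events() {
  kubectl get events --all-namespaces \
    --field-selector involvedObject.kind=NodeClaim \
    --sort-by=.lastTimestamp
}
```

Calling nodeclaim_events with no arguments prints whatever events the API server still retains for NodeClaims; events expire after a while, so an empty result shortly after a failed scale-up is more telling than one hours later.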
You can unblock your scale-up by requesting additional quota on the subscription for that region, following the steps at https://learn.microsoft.com/en-us/azure/quotas/regional-quota-requests
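Before filing the request, it may help to check how close the subscription is to its limits in that region. A rough sketch using the Azure CLI; the helper name region_quota_usage and the example region are assumptions, not something taken from this cluster:

```shell
# Hypothetical helper: show compute quota usage (current value vs. limit)
# for one region. Pass the region the cluster runs in, e.g. eastus.
region_quota_usage() {
  az vm list-usage --location "$1" --output table
}
```

For example, region_quota_usage eastus lists each quota with its current value and limit, which makes it easy to spot a regional vCPU or VM-family quota that is already exhausted.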
@Bryce-Soghigian Empty events from the command
Also here is the current Quota sorted by Current usage
@Bryce-Soghigian hey any update on this? thank you 🚀
We are seeing similar behaviour in our cluster. Node claims are created, but they do not get into the Ready state. We do not see anything special in the events. They only say "Pod should schedule on: nodeclaim/app-g5gcw" and "Cannot disrupt NodeClaim" for existing nodes.
We have also checked that we have not reached our quota.
Ref: #438. Running az aks update -n cluster -g rg fixed the issue for us.
We are still seeing this behavior from time to time: nodeclaims are created, resulting in the creation of new VMs, but they do not manage to join the cluster and get into a Ready = true state. How do we debug this or provide you with logs? The only solution is to reconcile the state by running an empty update against the cluster.
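As a starting point for gathering details, something like the following could be run while a nodeclaim is stuck. This is only a sketch: the helper name inspect_nodeclaim is made up, and it assumes the NodeClaim resources are visible to kubectl in this cluster.

```shell
# Hypothetical helper: list all NodeClaims, then dump the details
# (status conditions and events) of the one passed as the first argument.
inspect_nodeclaim() {
  kubectl get nodeclaims -o wide
  kubectl describe nodeclaim "$1"
}
```

For example, inspect_nodeclaim app-g5gcw. In upstream Karpenter the status conditions in the describe output (Launched, Registered, Initialized) indicate how far the node got, which would be useful to attach here along with the timestamps.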
Version
Karpenter Version: v0.0.0
Kubernetes Version: v1.0.0
Expected Behavior
Create a new node
Actual Behavior
The above message is shown when describing a Pod, but no new Nodes are created
Steps to Reproduce the Problem
AKS Cluster with node auto provisioning enabled
Scale a deployment (Nginx, for example) to 20 replicas with a memory request of 8Gi
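The steps above could be reproduced roughly like this. The deployment name, image and helper name are assumptions for illustration, not the exact workload from the report:

```shell
# Hypothetical repro helper: create a 20-replica nginx deployment and
# give each pod an 8Gi memory request so existing nodes cannot fit them.
repro_scale_up() {
  kubectl create deployment nginx --image=nginx --replicas=20
  kubectl set resources deployment nginx --requests=memory=8Gi
}
```

After running repro_scale_up, kubectl get pods should show pods stuck in Pending if no new nodes are provisioned.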
Resource Specs and Logs