Open grzesuav opened 3 months ago
The bug fix for this has been rolled out
@kevinkrp93 can you elaborate a bit on which version it is fixed in? And share some details on how it was actually fixed?
Team, can we get details on how the fix was delivered, to confirm whether the issue is actually present in our clusters?
@grzesuav - It was a bug in compute which we fixed. Are you still facing this issue?
Describe the bug Cluster autoscaler was blocked because the cluster state was unhealthy. The cluster state was unhealthy because more than 30% of nodes were unhealthy. All of the unhealthy nodes were just dummy entries in the API server: they were preempted spot instances. For some reason, the objects in the API server were not deleted.
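To make the failure mode concrete, here is a minimal Python sketch of the health check as described above. It is an illustration only, not the autoscaler's actual code: the `Node` type and the function names are hypothetical, and the 30% threshold is simply the figure from this report (the real limit is configurable and the real logic has more conditions).

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified view of a node object in the API server."""
    name: str
    ready: bool


def unready_fraction(nodes):
    """Return the share of registered nodes that are not Ready."""
    if not nodes:
        return 0.0
    return sum(1 for n in nodes if not n.ready) / len(nodes)


def scaling_blocked(nodes, max_unready_fraction=0.30):
    """Assumed rule from this report: block scaling above the threshold."""
    return unready_fraction(nodes) > max_unready_fraction


def stale_candidates(nodes):
    """Names of nodes that are not Ready and may be leftover entries."""
    return [n.name for n in nodes if not n.ready]


# 6 healthy nodes plus 4 leftover entries for preempted spot instances.
nodes = [Node(f"regular-{i}", True) for i in range(6)]
nodes += [Node(f"spot-{i}", False) for i in range(4)]

blocked_before = scaling_blocked(nodes)

# Once the leftover objects are deleted, the fraction drops back to zero.
remaining = [n for n in nodes if n.ready]
blocked_after = scaling_blocked(remaining)
```

With four stale spot entries out of ten registered nodes, the unready share is 40%, so scaling is blocked even though every real node is healthy. Deleting the stale objects clears the condition.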
To Reproduce N/A
Expected behavior The spot node is removed from the API server after preemption.
Screenshots N/A
Environment (please complete the following information):
Additional context This was the first time it happened for me, in the eastus2 region.