berops / claudie

Cloud-agnostic managed Kubernetes
https://docs.claudie.io/
Apache License 2.0

Feature: Rollback on autoscaler failing to add new node #1567

Closed Despire closed 1 day ago

Despire commented 2 weeks ago

Description

The autoscaler adds new nodes to a nodepool based on resource usage within the spawned cluster. Currently there is no retry or error handling for the case where adding a node fails (for example due to quota limits or issues with container registries).

On error, the autoscaling event should roll back to the old state, i.e. remove any infrastructure that was already spawned and leave the nodepool as it was before the event.
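
A minimal sketch of the requested behaviour, to make the rollback rule concrete. It is not Claudie's actual code: the `Provisioner` interface, the `Nodepool` type, `ScaleUp`, and `fakeProvisioner` are all hypothetical names introduced for illustration, and it assumes a scale-up consists of two steps (spawn the infrastructure, then join the node to the cluster).

```go
package main

import (
	"errors"
	"fmt"
)

// Provisioner is a hypothetical abstraction over the scale-up steps;
// it is an assumption for this sketch, not Claudie's API.
type Provisioner interface {
	Spawn(node string) error   // create the infrastructure for the node
	Join(node string) error    // add the node to the Kubernetes cluster
	Destroy(node string) error // remove the node's infrastructure
}

// Nodepool is a simplified stand-in for the nodepool state.
type Nodepool struct {
	Nodes []string
}

// ScaleUp tries to add node to np. If a step fails after the
// infrastructure was spawned, it destroys that infrastructure and
// leaves np unchanged, so the nodepool stays in its old state.
func ScaleUp(p Provisioner, np *Nodepool, node string) error {
	if err := p.Spawn(node); err != nil {
		// Nothing was spawned, so there is nothing to roll back.
		return fmt.Errorf("spawn %s: %w", node, err)
	}
	if err := p.Join(node); err != nil {
		// Roll back: remove what was spawned and report both errors.
		return errors.Join(fmt.Errorf("join %s: %w", node, err), p.Destroy(node))
	}
	// Commit the new state only after every step succeeded.
	np.Nodes = append(np.Nodes, node)
	return nil
}

// fakeProvisioner simulates a provider; live tracks spawned nodes.
type fakeProvisioner struct {
	failJoin bool
	live     map[string]bool
}

func (f *fakeProvisioner) Spawn(node string) error {
	f.live[node] = true
	return nil
}

func (f *fakeProvisioner) Join(node string) error {
	if f.failJoin {
		return errors.New("simulated join failure")
	}
	return nil
}

func (f *fakeProvisioner) Destroy(node string) error {
	delete(f.live, node)
	return nil
}

func main() {
	p := &fakeProvisioner{failJoin: true, live: map[string]bool{}}
	np := &Nodepool{Nodes: []string{"node-1"}}
	err := ScaleUp(p, np, "node-2")
	// Prints whether an error occurred, the nodepool size, and the
	// number of spawned machines still alive after the rollback.
	fmt.Println(err != nil, len(np.Nodes), len(p.live)) // true 1 0
}
```

The nodepool state is updated only after all steps succeed, so a failed event needs to undo infrastructure only. Retrying before rolling back would be a separate policy layered on top of this.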

Exit criteria