Feature (What you would like to be added):
We'd like to make a few modifications that allow the cluster-autoscaler to scale down machines running etcd pods, which at present prevent any scale-down. Of course, where availability would be at risk (evicting etcd outside the maintenance window is potentially harmful), we shouldn't do this. Let's discuss our approach here; the list below is the outcome of a first discussion with @shreyas-s-rao.
Motivation (Why is this needed?):
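Costs. Nodes that run etcd pods currently cannot be scaled down by the cluster-autoscaler, even when they are underutilized.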
Approach/Hint to implement the solution (optional):
Do not add the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation to the etcd-main-compactor jobs (they currently get it only because it is in the etcd-main spec)
Remove the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation from the clustered etcd-main statefulset in general
Terminate the singleton etcd-main statefulset pod, but only if...
...we are inside the maintenance time window (with a jitter, so that we do not terminate all pods at once on the full hour, which may lead to prolonged detach/attach times) and...
...the cluster does not have the purpose "production" and...
...the requests utilization (CPU or memory) of the node hosting the pod is below 70% (for now; ideally configurable)
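The three conditions above can be sketched as a single decision function. This is only an illustration under stated assumptions, not existing druid code: the names `Candidate`, `Window`, `jitteredStart` and `shouldTerminate` are hypothetical, the jitter is derived deterministically from a hash of the pod name (a random offset would work equally well), and "CPU or memory" is interpreted as "the larger of the two must be below the threshold", which the list above leaves open.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// Candidate describes a singleton etcd-main pod and its node (hypothetical type).
type Candidate struct {
	Purpose                  string  // cluster purpose, e.g. "evaluation" or "production"
	CPURequestUtilization    float64 // requested CPU / allocatable CPU on the node, 0..1
	MemoryRequestUtilization float64 // requested memory / allocatable memory on the node, 0..1
}

// Window is a maintenance time window (hypothetical type).
type Window struct {
	Begin, End time.Time
}

// utilizationThreshold is the 70% from the list above; ideally configurable.
const utilizationThreshold = 0.70

// jitteredStart returns a per-pod start time inside the window so that not all
// pods are terminated at Begin. Assumption: a stable offset derived from a
// hash of the pod name.
func jitteredStart(w Window, podName string) time.Time {
	span := w.End.Sub(w.Begin)
	if span <= 0 {
		return w.Begin
	}
	h := fnv.New32a()
	h.Write([]byte(podName))
	offset := time.Duration(int64(h.Sum32()) % int64(span))
	return w.Begin.Add(offset)
}

// shouldTerminate reports whether all three conditions hold at time now.
func shouldTerminate(c Candidate, w Window, now time.Time, podName string) bool {
	if c.Purpose == "production" {
		return false
	}
	// Inside the window means: at or after the jittered start, before End.
	if now.Before(jitteredStart(w, podName)) || !now.Before(w.End) {
		return false
	}
	// Assumption: compare the larger of CPU and memory request utilization.
	util := c.CPURequestUtilization
	if c.MemoryRequestUtilization > util {
		util = c.MemoryRequestUtilization
	}
	return util < utilizationThreshold
}

// exampleWindow returns a fixed one-hour window used for illustration.
func exampleWindow() Window {
	begin := time.Date(2024, 1, 1, 22, 0, 0, 0, time.UTC)
	return Window{Begin: begin, End: begin.Add(time.Hour)}
}

func main() {
	w := exampleWindow()
	c := Candidate{Purpose: "evaluation", CPURequestUtilization: 0.4, MemoryRequestUtilization: 0.5}
	now := w.End.Add(-time.Minute)
	fmt.Println("jittered start:", jitteredStart(w, "etcd-main-0"))
	fmt.Println("terminate:", shouldTerminate(c, w, now, "etcd-main-0"))
}
```

Deriving the jitter from the pod name keeps the decision reproducible across controller restarts; whether that matters more than true randomness is open for discussion.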
Implementation detail:
Instead of druid reading the cluster purpose itself, gardenlet could read it and set a field on the etcd resource that permits or forbids voluntary evictions.
A new controller in druid, let's call it terminator (:-)), could then implement the checks above.
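To make the proposed split of responsibilities concrete, here is a minimal sketch. It is not the real etcd resource API: `EtcdSpec` is a heavily reduced stand-in, and the field `VoluntaryEvictionAllowed` as well as both function names are hypothetical. The point is only that gardenlet translates the purpose into a flag, and the terminator reads the flag without knowing anything about purposes.

```go
package main

import "fmt"

// EtcdSpec is a reduced, hypothetical stand-in for the etcd resource spec.
type EtcdSpec struct {
	Replicas int
	// VoluntaryEvictionAllowed is a hypothetical field that gardenlet would set.
	VoluntaryEvictionAllowed bool
}

// setEvictionPolicy sketches the gardenlet side: it knows the cluster purpose
// and records the resulting decision on the etcd resource.
func setEvictionPolicy(spec *EtcdSpec, purpose string) {
	spec.VoluntaryEvictionAllowed = purpose != "production"
}

// terminatorMayAct sketches the druid side: the terminator only considers
// singleton etcds for which voluntary eviction has been permitted.
func terminatorMayAct(spec EtcdSpec) bool {
	return spec.Replicas == 1 && spec.VoluntaryEvictionAllowed
}

func main() {
	spec := EtcdSpec{Replicas: 1}
	setEvictionPolicy(&spec, "evaluation")
	fmt.Println("terminator may act:", terminatorMayAct(spec))
}
```

The maintenance window and utilization checks would still have to be evaluated by the terminator at runtime; only the purpose check moves to gardenlet in this sketch.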