zalando-incubator / es-operator

Kubernetes Operator for Elasticsearch

[Feature] auto scaling configurations #120

Open AyWa opened 3 years ago

AyWa commented 3 years ago

In a cluster with a huge variance in usage, it is useful to be able to set different auto-scaling configurations depending on the size of the cluster.

It would be good to set different minShardsPerNode, maxShardsPerNode, and scaleUpCPUBoundary values depending on the size of the cluster. I'm not sure what the correct syntax would be, but one option is adding a rules or overrides section with a selector like replicaLte (replica count less than or equal). The operator would check the override rules and fall back to the defaults if none match.

Before

  scaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 99
    minIndexReplicas: 1
    maxIndexReplicas: 40
    minShardsPerNode: 3
    maxShardsPerNode: 3
    scaleUpCPUBoundary: 75
    scaleUpThresholdDurationSeconds: 240
    scaleUpCooldownSeconds: 1000
    scaleDownCPUBoundary: 40
    scaleDownThresholdDurationSeconds: 1200
    scaleDownCooldownSeconds: 1200
    diskUsagePercentScaledownWatermark: 80

After

  scaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 99
    minIndexReplicas: 1
    maxIndexReplicas: 40
    minShardsPerNode: 3
    maxShardsPerNode: 3
    scaleUpCPUBoundary: 75
    scaleUpThresholdDurationSeconds: 240
    scaleUpCooldownSeconds: 1000
    scaleDownCPUBoundary: 40
    scaleDownThresholdDurationSeconds: 1200
    scaleDownCooldownSeconds: 1200
    diskUsagePercentScaledownWatermark: 80
    rules:
      - replicaLte: 2
        scaleUpCPUBoundary: 30
      - replicaLte: 4
        scaleUpCPUBoundary: 40
      - replicaLte: 10
        scaleUpCPUBoundary: 60
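
The lookup logic for the proposed rules could work as follows: walk the rules in ascending `replicaLte` order, take the first match, and fall back to the top-level default. This is only a sketch of the idea, not es-operator code; the `ScalingRule` type and `scaleUpCPUBoundary` function are hypothetical names for illustration.

```go
package main

import "fmt"

// ScalingRule is a hypothetical override entry matching the proposed
// `rules` section: it applies when the current replica count is less
// than or equal to ReplicaLte.
type ScalingRule struct {
	ReplicaLte         int
	ScaleUpCPUBoundary int
}

// scaleUpCPUBoundary returns the boundary from the first matching rule
// (rules assumed sorted ascending by ReplicaLte), falling back to the
// top-level default when no rule matches.
func scaleUpCPUBoundary(replicas, defaultBoundary int, rules []ScalingRule) int {
	for _, r := range rules {
		if replicas <= r.ReplicaLte {
			return r.ScaleUpCPUBoundary
		}
	}
	return defaultBoundary
}

func main() {
	rules := []ScalingRule{
		{ReplicaLte: 2, ScaleUpCPUBoundary: 30},
		{ReplicaLte: 4, ScaleUpCPUBoundary: 40},
		{ReplicaLte: 10, ScaleUpCPUBoundary: 60},
	}
	fmt.Println(scaleUpCPUBoundary(2, 75, rules))  // small cluster: aggressive scale-up at 30% CPU
	fmt.Println(scaleUpCPUBoundary(7, 75, rules))  // mid-size: 60
	fmt.Println(scaleUpCPUBoundary(50, 75, rules)) // large cluster: default 75
}
```

With this shape a small cluster scales up aggressively (low CPU boundary) while a large one only scales at the default threshold, which matches the cost-optimization goal below.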

This is mainly for cost optimization. During the night a cluster can be very small, but in the early morning it needs to be able to scale aggressively; once the cluster grows large, it can scale more slowly.

I am willing to implement this feature if it makes sense for this project.

otrosien commented 3 years ago

@AyWa Thanks for the suggestion. As horizontal auto-scaling becomes more powerful in recent Kubernetes releases, we should consider dropping our custom route and tying this back to the HPA. I'm interested in hearing your thoughts on this.