epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0

Autoscaling of the high availability service #2639

Open NShaforostov opened 2 years ago

NShaforostov commented 2 years ago

Background

As separate Cloud Pipeline deployments may periodically face large workload peaks, it would be useful to implement autoscaling of the system nodes (HA service) - to allow the service to scale up and down according to the actual workload.

Approach

We shall monitor the state of the system API instances (at least their RAM and/or CPU consumption). The HA service shall have a minimum number of instances to run. If the consumption exceeds some predefined threshold for some period of time, new instances shall be launched as the system needs them (i.e. the HA service shall be scaled up). If the workload subsides, the additional instances shall be stopped (i.e. the HA service shall be scaled down - but not below the predefined minimum number of instances).
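The threshold logic described above can be sketched as follows. This is a minimal illustration only: the threshold, window and instance-limit values are hypothetical, not actual Cloud Pipeline preferences, and real monitoring would feed in live consumption metrics.

```python
from collections import deque

class ScalingDecision:
    """Decides when to scale the HA service up or down based on a
    sliding window of consumption samples (hypothetical sketch)."""

    def __init__(self, min_instances=2, max_instances=10,
                 up_threshold=0.9, down_threshold=0.4, window=5):
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.up_threshold = up_threshold
        self.down_threshold = down_threshold
        self.samples = deque(maxlen=window)

    def next_instance_count(self, current, consumption):
        """Record a consumption sample (0..1) and return the desired
        instance count for the HA service."""
        self.samples.append(consumption)
        if len(self.samples) < self.samples.maxlen:
            return current  # not enough history yet
        if all(s > self.up_threshold for s in self.samples):
            return min(current + 1, self.max_instances)  # sustained peak
        if all(s < self.down_threshold for s in self.samples):
            return max(current - 1, self.min_instances)  # workload subsided
        return current
```

Requiring the whole window to exceed the threshold is one way to express "exceeds the threshold during some time" and avoids scaling on short spikes.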

I suppose that the described behavior shall be managed by some new system preferences, e.g.:

Additionally

  1. Each HA service autoscaling action (scaling up or down) should be accompanied by a corresponding email to the admin
  2. Add and show at the GUI (Cluster state page) new labels for the HA service nodes:
    • each label shall show the state of the corresponding running service instance
    • the label shall be colorized according to the current instance consumption, for example: if the consumption is less than 50%, the label shall be green; between 50% and 90%, orange; over 90%, red
  3. Add a new filter at the GUI (Cluster state page) to show only system service instances:
    • with this filter, only system service instances shall be shown in the nodes list
    • system service instances shall not be displayed when the "No run id" filter is selected
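The label colorizing rule from item 2 can be expressed as a simple mapping (a sketch only; the function name is hypothetical, the thresholds come from the description above):

```python
def consumption_color(consumption_percent):
    """Map an instance's consumption (0..100 %) to a label color,
    following the thresholds described above: <50% green,
    50-90% orange, >90% red."""
    if consumption_percent < 50:
        return 'green'
    if consumption_percent <= 90:
        return 'orange'
    return 'red'
```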
tcibinan commented 2 years ago

Goals

From the technical point of view we would like to achieve the following goals:

  1. Kubernetes cluster shall be autoscaled based on specific deployment utilization.
  2. Autoscaling shall not depend on other Cloud Pipeline services.
  3. Autoscaling shall be expandable in terms of autoscaling triggers and target deployments.
  4. Autoscaling shall support independent autoscaling of multiple deployments.
  5. Autoscaling shall not abort most running requests/operations.

Implementation

I suggest using an additional autoscaling service which can horizontally autoscale both kubernetes deployments and kubernetes nodes in order to achieve some predefined target utilization. The following key points give a more in-depth understanding of the approach.

  1. Autoscaling service is an independent kubernetes deployment itself.
  2. Autoscaling service deployment is created for each target deployment.
  3. Autoscaling service configuration resides in a kubernetes configmap as a simple json configuration.
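An illustrative configmap payload for such a service might look like this. All keys and values are hypothetical and only show the shape of the json configuration, not a fixed schema:

```json
{
  "target": {
    "deployment": "cp-api-srv",
    "labels": {"cloud-pipeline/cp-api-srv": "true"}
  },
  "triggers": {
    "cpu_utilization": {"target": 50, "delta": 10},
    "ram_utilization": {"target": 50, "delta": 10}
  },
  "limits": {
    "min_pods": 2, "max_pods": 10,
    "min_nodes": 2, "max_nodes": 10,
    "min_trigger_duration_s": 60,
    "post_scale_delay_s": 300
  },
  "instance": {"type": "m5.large"}
}
```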

Algorithm

The following autoscaling algorithm can be used by the autoscaling service.

  1. find the deployment
  2. find the corresponding pods
  3. find the corresponding nodes
    • observe all nodes
    • distinguish static and autoscaled nodes
    • manage only autoscaled nodes
    • ignore autoscaled nodes which have non target pods
  4. check triggers
    • disk pressure statuses of target nodes (ex. target = 0 nodes with a disk pressure status)
    • ram pressure statuses of target nodes (ex. target = 0 nodes with a ram pressure status)
    • cpu utilization of target nodes (ex. target utilization = 50 +- 10 %. scale up on 60%, scale down on 40%)
    • ram utilization of target nodes (ex. target utilization = 50 +- 10 %. scale up on 60%, scale down on 40%)
    • cluster nodes per target pod coefficient (ex. target coefficient = 100 cluster nodes per 1 target pod)
    • target pods per node coefficient (ex. target coefficient = 2 target pods per 1 node)
    • target pod failures per hour coefficient (ex. target coefficient = 3 pod failures per hour)
  5. check limits
    • minimum trigger duration (ex. trigger is active for 1 minute)
    • minimum pods number (ex. 2 pods minimum)
    • maximum pods number (ex. 10 pods maximum)
    • minimum nodes number (ex. 2 nodes minimum)
    • maximum nodes number (ex. 10 nodes maximum)
    • post scale delay (ex. scale no more frequently than once per 5 minutes)
  6. scale up node if needed
    • launch instance
    • attach node
    • set labels
  7. scale up deployment if needed
  8. scale down node if needed
    • drain node
    • terminate instance
  9. scale down deployment if needed
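One iteration of steps 5-9 can be sketched as a pure planning function. This is an assumption-laden sketch: steps 1-4 (finding the deployment, pods, nodes and evaluating triggers) are taken as already done and their results passed in; the action names and the `limits` keys are hypothetical.

```python
def plan_actions(pods, nodes, triggers, limits):
    """Return the ordered scaling actions for one autoscaling iteration.

    pods, nodes -- the target pods and autoscaled nodes (steps 1-3)
    triggers    -- already-evaluated trigger directions, e.g. ['up'] (step 4)
    limits      -- min/max constraints, e.g. {'min_pods': 2, ...} (step 5)
    """
    actions = []
    if 'up' in triggers:
        if len(pods) < limits['max_pods'] and len(nodes) < limits['max_nodes']:
            actions += ['launch_instance', 'attach_node', 'set_labels',  # step 6
                        'scale_up_deployment']                           # step 7
    elif 'down' in triggers:
        if len(pods) > limits['min_pods'] and len(nodes) > limits['min_nodes']:
            actions += ['drain_node', 'terminate_instance',              # step 8
                        'scale_down_deployment']                         # step 9
    return actions
```

Keeping the planning separate from the kubernetes/cloud calls makes the limits logic trivially testable and keeps the service independent of other Cloud Pipeline components (goal 2).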

Configuration

The following settings shall be configured for the autoscaling service to work:

  1. kubernetes deployment to manage (ex. cp-api-srv)
  2. kubernetes labels to manage (ex. cloud-pipeline/cp-api-srv)
  3. triggers to check (ex. cpu utilization = 50%)
  4. limits to consider (ex. from 1 to 5 pods/nodes)
  5. cloud instance to scale (ex. instance type, iam role, security groups and etc.)
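A minimal loader for these settings might validate the required sections before the service starts. The section names here are hypothetical placeholders matching the list above, not a fixed schema:

```python
import json

# Hypothetical top-level sections, one per configuration item above.
REQUIRED_SECTIONS = ('deployment', 'labels', 'triggers', 'limits', 'instance')

def load_autoscaler_config(raw_json):
    """Parse the configmap payload and fail fast on missing sections."""
    config = json.loads(raw_json)
    missing = [s for s in REQUIRED_SECTIONS if s not in config]
    if missing:
        raise ValueError(
            'Missing autoscaler config sections: %s' % ', '.join(missing))
    return config
```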

Questions

  1. Shall the autoscaling be configurable from the Cloud Pipeline GUI?
maryvictol commented 2 years ago

The following autoscaler parameters were checked:

trigger:

rules:

limit:

tcibinan commented 2 years ago

Cherry-picked to release/0.16 via 46ba80cb3ead2e43721a502404b1e3e4949255cf, 4eb26dbcef6227f133d534736054136fd623a82d, a19e73fbee597f09321d7981809ccbcfbc461835, e60fbcded01ef44c52b070e331177936c6f7a5f8 and 90e593c422d65d58d2d6fe2bdec38861e2a3d157.