google / kube-startup-cpu-boost

Kube Startup CPU Boost is a controller that increases CPU resource requests and limits during Kubernetes workload startup time
Apache License 2.0
271 stars 14 forks source link
cpu-boost kubernetes kubernetes-operator

Kube Startup CPU Boost

Kube Startup CPU Boost is a controller that increases CPU resource requests and limits during Kubernetes workload startup time. Once the workload is up and running, the resources are set back to their original values.

Build Version Go Report Card GitHub

Note: this is not an officially supported Google product.


Table of contents

Description

The primary use cases for Kube Startup CPU Boosts are workloads that require extra CPU resources during the startup phase - typically JVM based applications.

The Kube Startup CPU Boost leverages In-place Resource Resize for Kubernetes Pods feature introduced in Kubernetes 1.27. It allows to revert workload's CPU resource requests and limits back to their original values without the need to recreate the Pods.

The increase of resources is achieved by Mutating Admission Webhook. By default, the webhook also removes CPU resource limits if present. The original resource values are set by operator after a given period of time or when the POD condition is met.

Installation

Requires Kubernetes 1.27 or newer with InPlacePodVerticalScaling feature gate enabled.

To install the latest release of Kube Startup CPU Boost in your cluster, run the following command:

kubectl apply -f https://github.com/google/kube-startup-cpu-boost/releases/download/v0.11.0/manifests.yaml

The Kube Startup CPU Boost components run in the kube-startup-cpu-boost-system namespace.

Install with Kustomize

You can use Kustomize to install the Kube Startup CPU Boost with your own kustomization file.

cat <<EOF > kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/google/kube-startup-cpu-boost?ref=v0.11.0
EOF
kubectl kustomize | kubectl apply -f -

Installation on Kind cluster

You can use KIND to get a local cluster for testing.

cat <<EOF > kind-poc-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: poc
nodes:
- role: control-plane
- role: worker
- role: worker
featureGates:
  InPlacePodVerticalScaling: true 
EOF
kind create cluster --config kind-poc-cluster.yaml

Installation on GKE cluster

You can use GKE Alpha cluster to run against the remote cluster.

gcloud container clusters create poc \
    --enable-kubernetes-alpha \
    --no-enable-autorepair \
    --no-enable-autoupgrade \
    --region europe-central2

Usage

  1. Create StartupCPUBoost object in your workload's namespace

    apiVersion: autoscaling.x-k8s.io/v1alpha1
    kind: StartupCPUBoost
    metadata:
     name: boost-001
     namespace: demo
    selector:
     matchExpressions:
     - key: app.kubernetes.io/name
       operator: In
       values: ["spring-demo-app"]
    spec:
     resourcePolicy:
       containerPolicies:
       - containerName: spring-demo-app
         percentageIncrease:
           value: 50
     durationPolicy:
       podCondition:
         type: Ready
         status: "True"

    The above example will boost CPU requests and limits of a container spring-demo-app in a PODs with app.kubernetes.io/name=spring-demo-app label in demo namespace. The resources will be increased by 50% until the POD Condition Ready becomes True.

  2. Schedule your workloads and observe the results

Features

[Boost target] POD label selector

Define the PODs that will be subject for resource boost with a label selector.

spec:
  selector:
    matchExpressions:
    - key: app.kubernetes.io/name
       operator: In
       values: ["spring-rest-jpa"]

[Boost resources] percentage increase

Define the percentage increase for a target container(s). The CPU requests and limits of selected container(s) will be increase by the given percentage value.

spec:
  containerPolicies:
   - containerName: spring-rest-jpa
     percentageIncrease:
       value: 50

[Boost resources] fixed target

Define the fixed resources for a target container(s). The CPU requests and limits of selected container(s) will be set to the given values. Note that specified requests and limits have to be higher than the ones in the container.

spec:
  containerPolicies:
   - containerName: spring-rest-jpa
     fixedResources:
       requests: "1"
       limits: "2"

[Boost duration] fixed time

Define the fixed amount of time, the resource boost effect will last for it since the POD's creation.

spec:
 durationPolicy:
  fixedDuration:
    unit: Seconds
    value: 60

[Boost duration] POD condition

Define the POD condition, the resource boost effect will last until the condition is met.

  spec:
   durationPolicy:
     podCondition:
       type: Ready
       status: "True" 

Configuration

Kube Startup CPU Boost operator can be configured with environmental variables.

Variable Type Default Description
POD_NAMESPACE string kube-startup-cpu-boost-system Kube Startup CPU Boost operator namespace
MGR_CHECK_INTERVAL int 5 Duration in seconds between boost manager checks for time based boost duration policy
LEADER_ELECTION bool false Enables leader election for controller manager
METRICS_PROBE_BIND_ADDR string :8080 Address the metrics endpoint binds to
HEALTH_PROBE_BIND_ADDR string :8081 Address the health probe endpoint binds to
SECURE_METRICS bool false Determines if the metrics endpoint is served securely
ZAP_LOG_LEVEL int 0 Log level for ZAP logger
ZAP_DEVELOPMENT bool false Enables development mode for ZAP logger
HTTP2 bool false Determines if the HTTP/2 protocol is used for webhook and metrics servers
REMOVE_LIMITS bool true Enables operator to remove container CPU limits during the boost time
VALIDATE_FEATURE_ENABLED bool true Enables validation of required feature gate on operator's startup

Metrics

Kube Startup CPU Boost exposes prometheus metrics to monitor the health of the system and the status of Startup CPU Boosts.

Metric name Type Description Labels
boost_configurations Gauge Number of registered Kube Startup CPU Boost configuration namespace: the namespace of a Kube Startup CPU Boost
boost_containers_total Counter Number of a containers which CPU resources were increased namespace: the namespace of container's POD, boost: name of a Kube Startup CPU Boost that increased container resources
boost_containers_active Gauge Number of a containers which CPU resources and not yet reverted to their original values namespace: the namespace of container's POD, boost: name of a Kube Startup CPU Boost that increased container resources

Scraping: Google Cloud Managed Service for Prometheus

The below PodMonitoring example can be used to scrape Kube Startup CPU Boost metrics with Google Managed Service for Prometheus.

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  labels:
    control-plane: controller-manager
    app.kubernetes.io/name: controller-manager-metrics-monitor
    app.kubernetes.io/instance: controller-manager-metrics-monitor
    app.kubernetes.io/component: metrics
    app.kubernetes.io/created-by: kube-startup-cpu-boost
    app.kubernetes.io/part-of: kube-startup-cpu-boost
  name: controller-manager-metrics-monitor
  namespace: kube-startup-cpu-boost-system
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  endpoints:
  - port: metrics
    interval: 15s

License

Apache License 2.0