EKS Rolling Update

EKS Rolling Update is a utility for updating the launch configuration or template of worker nodes in an EKS cluster.

Intro

The tool updates worker nodes in a rolling fashion and performs health checks on your EKS cluster to avoid service disruption. To achieve this, it performs the following actions:

Pauses Kubernetes Autoscaler (Optional)
Finds the worker nodes whose launch config or template does not match that of their ASG
Scales up the desired capacity
Ensures the ASGs are healthy and that the new nodes have joined the EKS cluster
Cordons the outdated worker nodes
Suspends AWS Autoscaling actions while the update is in progress
Drains the outdated EKS worker nodes one by one
Terminates EC2 instances of the worker nodes one by one
Detaches EC2 instances from the ASG one by one
Scales down the ASG to original count (in case of failure)
Resumes AWS Autoscaling actions
Resumes Kubernetes Autoscaler (Optional)
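The "find outdated worker nodes" step above can be sketched as a comparison between each instance and its ASG. This is a minimal illustration, not the tool's actual implementation: the function names are hypothetical, the input is assumed to be one ASG dictionary shaped like an entry of a `DescribeAutoScalingGroups` response, and launch template versions are assumed to be already resolved to concrete version strings (aliases such as `$Latest` would need a lookup via `ec2:DescribeLaunchTemplates` first).

```python
from typing import List


def is_outdated(instance: dict, asg: dict) -> bool:
    """Return True if the instance's launch config/template differs from its ASG's."""
    asg_lc = asg.get("LaunchConfigurationName")
    if asg_lc is not None:
        # ASG uses a launch configuration: compare by name.
        return instance.get("LaunchConfigurationName") != asg_lc

    asg_lt = asg.get("LaunchTemplate")
    if asg_lt is not None:
        inst_lt = instance.get("LaunchTemplate")
        if inst_lt is None:
            return True
        # ASG uses a launch template: compare template ID and version.
        return (
            inst_lt.get("LaunchTemplateId") != asg_lt.get("LaunchTemplateId")
            or inst_lt.get("Version") != asg_lt.get("Version")
        )

    # Assumption: other setups (e.g. mixed instances policies) are not
    # covered by this sketch and are treated as up to date.
    return False


def find_outdated_instances(asg: dict) -> List[str]:
    """Return the IDs of all instances in the ASG that are out of date."""
    return [i["InstanceId"] for i in asg.get("Instances", []) if is_outdated(i, asg)]
```

Running the tool with `--plan` remains the supported way to see which instances it considers out of date.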

Requirements

kubectl installed
KUBECONFIG environment variable set, or config available in ${HOME}/.kube/config by default
AWS credentials configured

IAM Requirements

The following IAM permissions are required:

autoscaling:DescribeAutoScalingGroups
autoscaling:TerminateInstanceInAutoScalingGroup
autoscaling:SuspendProcesses
autoscaling:ResumeProcesses
autoscaling:UpdateAutoScalingGroup
autoscaling:CreateOrUpdateTags
autoscaling:DeleteTags
ec2:DescribeLaunchTemplates
ec2:DescribeInstances
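As an illustration, the permissions listed above could be assembled into an IAM policy document like this. The helper name `build_policy` is hypothetical, and `"Resource": "*"` is an assumption made for brevity; you may want to restrict it to the relevant ASGs where the actions support resource-level permissions.

```python
import json

# Actions taken verbatim from the list above.
REQUIRED_ACTIONS = [
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
    "autoscaling:SuspendProcesses",
    "autoscaling:ResumeProcesses",
    "autoscaling:UpdateAutoScalingGroup",
    "autoscaling:CreateOrUpdateTags",
    "autoscaling:DeleteTags",
    "ec2:DescribeLaunchTemplates",
    "ec2:DescribeInstances",
]


def build_policy(actions):
    """Build an IAM policy document allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: wildcard resource; tighten as needed.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(REQUIRED_ACTIONS), indent=2))
```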

RBAC permissions

The following RBAC permission rules are required for graceful termination:

- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]

- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]

- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "patch", "list", "watch"]

- apiGroups: ["apps"]
  resources: ["statefulsets", "daemonsets", "deployments", "replicasets"]
  verbs: ["get", "list"]

Installation

From source

virtualenv -p python3 venv
source venv/bin/activate
pip3 install -r requirements.txt

Usage

usage: eks_rolling_update.py [-h] --cluster_name CLUSTER_NAME [--plan]

Rolling update on cluster

optional arguments:
  -h, --help            show this help message and exit
  --cluster_name CLUSTER_NAME, -c CLUSTER_NAME
                        the cluster name to perform rolling update on
  --plan, -p            perform a dry run to see which instances are out of
                        date

Example:

eks_rolling_update.py -c my-eks-cluster

We recommend running the tool as a (cron)job within your EKS cluster (see Terraform example), as this allows for easy RBAC permission management.

Container images

Container images for this project are made available as GitHub Packages.

You can run them using docker or your preferred container runtime:

docker run -ti --rm \
  -e AWS_DEFAULT_REGION \
  -v "${HOME}/.aws:/root/.aws" \
  -v "${HOME}/.kube/config:/root/.kube/config" \
  ghcr.io/deinstapel/eks-rolling-update:edge \
  -c my-cluster

Pass in any additional environment variables and options as described elsewhere in this file.

Terraform example

A Terraform example of how to deploy eks-rolling-update as a Kubernetes CronJob in your EKS cluster can be found here. Adjust it according to your needs.

Configuration

Core Configuration

Environment Variable	Description	Default
RUN_MODE	Overall strategy for handling multiple ASGs & identifying nodes to roll. See Run Modes section below	1
DRY_RUN	If True, only a query will be run to determine which worker nodes are outdated without running an update operation	False
CLUSTER_HEALTH_WAIT	Number of seconds to wait after the ASG has been scaled up before checking the health of the nodes in the cluster	90
CLUSTER_HEALTH_RETRY	Number of attempts to validate the health of the cluster after ASG has been scaled	1
GLOBAL_MAX_RETRY	Number of attempts of a node health or instance termination check	12
GLOBAL_HEALTH_WAIT	Number of seconds to wait before retrying a node health or instance termination check	20
BETWEEN_NODES_WAIT	Number of seconds to wait after removing a node before continuing on	0
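To make the interplay of `GLOBAL_MAX_RETRY` and `GLOBAL_HEALTH_WAIT` concrete, here is a minimal sketch of a bounded polling loop. It is an assumption about how such settings are typically used, not a copy of the tool's code; `wait_until` and its parameters are hypothetical names, and the defaults mirror the table above.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    max_retry: int = 12,        # cf. GLOBAL_MAX_RETRY
    wait_seconds: float = 20,   # cf. GLOBAL_HEALTH_WAIT
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to max_retry times, sleeping between failed attempts.

    Returns True as soon as check() succeeds, False if every attempt fails.
    """
    for attempt in range(1, max_retry + 1):
        if check():
            return True
        if attempt < max_retry:
            # Do not sleep after the final failed attempt.
            sleep(wait_seconds)
    return False
```

With the defaults, a check that never succeeds gives up after 12 attempts and 11 waits of 20 seconds each, so slow environments may need larger values.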

ASG & Node-Related Controls

Environment Variable	Description	Default
ASG_DESIRED_STATE_TAG	Temporary tag which will be saved to the ASG to store the state of the EKS cluster prior to update	eks-rolling-update:desired_capacity
ASG_ORIG_CAPACITY_TAG	Temporary tag which will be saved to the ASG to store its original capacity prior to update	eks-rolling-update:original_capacity
ASG_ORIG_MAX_CAPACITY_TAG	Temporary tag which will be saved to the ASG to store its original maximum capacity prior to update	eks-rolling-update:original_max_capacity
ASG_NAMES	List of space-delimited ASG names. Out of ASGs attached to the cluster, only these will be processed for rolling update. If this is left empty all ASGs of the cluster will be processed.	""
BATCH_SIZE	Number of instances to scale the ASG by at a time. When set to 0, batching is disabled. See the Batching section	0
MAX_ALLOWABLE_NODE_AGE	Maximum age, in days, that a node is allowed to reach. Used with `RUN_MODE` 4, which rolls nodes based on their age	6
EXCLUDE_NODE_LABEL_KEYS	List of space-delimited keys for node labels. Nodes with a label using one of these keys will be excluded from the node count when scaling the cluster.	spotinst.io/node-lifecycle
ASG_USE_TERMINATION_POLICY	Prefer ASG termination policy (instance terminate/detach handled by ASG according to configured termination policy)	False
INSTANCE_WAIT_FOR_STOPPING	Only wait for terminated instances to be in `stopping` or `shutting-down` state, instead of fully `terminated` or `stopped`	False

K8S Node & Pod Controls

Environment Variable	Description	Default
K8S_AUTOSCALER_ENABLED	If True, the Kubernetes Autoscaler will be paused before running the update	False
K8S_AUTOSCALER_NAMESPACE	Namespace where Kubernetes Autoscaler is deployed	default
K8S_AUTOSCALER_DEPLOYMENT	Deployment name of Kubernetes Autoscaler	cluster-autoscaler
K8S_AUTOSCALER_REPLICAS	Number of replicas to scale back up to after the Kubernetes Autoscaler has been paused	2
K8S_CONTEXT	Context from the Kubernetes config to use. If this is left undefined the `current-context` is used	None
K8S_PROXY_BYPASS	Set to `true` to ignore `HTTPS_PROXY` and `HTTP_PROXY` and disable use of any configured proxy when talking to the K8S API	False
TAINT_NODES	Replace the default cordon-before-drain strategy with `NoSchedule` tainting, as a workaround for K8S < `1.19` prematurely removing cordoned nodes from `Service`-managed `LoadBalancer`s	False
EXTRA_DRAIN_ARGS	Additional space-delimited args to supply to the `kubectl drain` command, e.g. `--force=true`. See `kubectl drain -h`	""
ENFORCED_DRAINING	If draining fails for a node due to corrupted `PodDisruptionBudget`s or failing pods, retry draining with `--disable-eviction=true` and `--force=true` for this node to prevent aborting the script. This is useful for completing the rolling update in development and testing environments and should not be used in production environments, since it bypasses `PodDisruptionBudget` checks	False
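The settings above come in three shapes: booleans, integers, and space-delimited lists. The following sketch shows one way such environment variables could be parsed. The helper names and the set of accepted truthy strings are assumptions for illustration; the tool's own parsing may differ.

```python
import os
from typing import List, Mapping


def env_bool(name: str, default: bool = False,
             environ: Mapping[str, str] = os.environ) -> bool:
    """Parse a boolean setting such as DRY_RUN or K8S_AUTOSCALER_ENABLED."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    # Assumption: these spellings count as true; everything else is false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int,
            environ: Mapping[str, str] = os.environ) -> int:
    """Parse an integer setting such as BATCH_SIZE or GLOBAL_MAX_RETRY."""
    return int(environ.get(name, default))


def env_list(name: str, default: str = "",
             environ: Mapping[str, str] = os.environ) -> List[str]:
    """Parse a space-delimited setting such as ASG_NAMES."""
    return environ.get(name, default).split()
```

For example, with `ASG_NAMES="asg-a asg-b"` exported, `env_list("ASG_NAMES")` yields a two-element list, while an unset or empty variable yields an empty list, matching the "process all ASGs" default described above.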

Run Modes

There are a number of different values which can be set for the RUN_MODE environment variable.

1 is the default.

Mode Number	Description
1	Scale up and cordon/taint the outdated nodes of each ASG one-by-one, just before we drain them.
2	Scale up and cordon/taint the outdated nodes of all ASGs all at once at the beginning of the run.
3	Cordon/taint the outdated nodes of all ASGs at the beginning of the run but scale each ASG one-by-one.
4	Roll EKS nodes based on age instead of launch config (works with `MAX_ALLOWABLE_NODE_AGE` with default 6 days value).

Each of them has different advantages and disadvantages.

Scaling up all ASGs at once may cause AWS EC2 instance limits to be exceeded
Only cordoning the nodes on a per-ASG basis will mean that pods are likely to be moved more than once
Cordoning the nodes for all ASGs at once could cause issues if new pods need to start during the process
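Mode 4 selects nodes by age rather than by launch config. A minimal sketch of such an age check follows. Which timestamp the tool uses and whether the comparison is strict are assumptions here, and `is_node_too_old` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_node_too_old(created_at: datetime, max_age_days: int = 6,
                    now: Optional[datetime] = None) -> bool:
    """Return True if the node is older than max_age_days.

    created_at must be timezone-aware; max_age_days mirrors
    MAX_ALLOWABLE_NODE_AGE (default 6).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Assumption: a node exactly max_age_days old is still allowed.
    return now - created_at > timedelta(days=max_age_days)
```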

Batching

EKS Rolling Update can batch scale-out the ASG to progressively reach the desired instance count before it begins draining the nodes.

This is intended for use in cases where a large ASG scale-out may result in instances failing to register with EKS. Such a scenario is more likely to occur with larger ASGs where (for example) a 100 instance ASG may be asked to scale to 200 (temporarily). Users may find that some instances never register, and this causes EKS Rolling Update to hang indefinitely waiting for the registered EKS node count to match the instance count.

If this happens, you may want to consider batching.

For example, if the ASG will be scaled from 100 instances to 200 instances, specifying a batch size of 10 will result in the ASG first scaling to 110, then 120, 130, and so on until 200 is reached. Once the desired count is reached, the tool will proceed with the normal draining/scale-in operations.
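The sequence of intermediate capacities in this example can be computed as follows. This is a sketch of the arithmetic only; `batch_targets` is a hypothetical helper, not a function exposed by the tool.

```python
from typing import List


def batch_targets(current: int, desired: int, batch_size: int) -> List[int]:
    """Return the successive desired-capacity values for a batched scale-out.

    A batch_size of 0 (or less) disables batching, so the ASG is scaled
    to the desired count in a single step.
    """
    if batch_size <= 0:
        return [desired]
    # Intermediate steps strictly below the final target...
    targets = list(range(current + batch_size, desired, batch_size))
    # ...followed by the final target itself, even if the last step is smaller.
    targets.append(desired)
    return targets
```

`batch_targets(100, 200, 10)` yields 110, 120, and so on up to 200, while `batch_targets(100, 125, 10)` yields 110, 120, 125, with the final step covering the remainder.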

Examples

Plan

$ python eks_rolling_update.py --cluster_name YOUR_EKS_CLUSTER_NAME --plan

Apply Changes

$ python eks_rolling_update.py --cluster_name YOUR_EKS_CLUSTER_NAME

Cluster Autoscaler

If using cluster-autoscaler, you must let eks-rolling-update know that cluster-autoscaler is running in your cluster by exporting the following environment variables:

$ export  K8S_AUTOSCALER_ENABLED=true \
          K8S_AUTOSCALER_NAMESPACE="${CA_NAMESPACE}" \
          K8S_AUTOSCALER_DEPLOYMENT="${CA_DEPLOYMENT_NAME}"

Disable operations on cluster-autoscaler

$ unset K8S_AUTOSCALER_ENABLED

Configure tool via .env file

Rather than using environment variables, you can use a .env file within your working directory to load updater settings, e.g.:

$ cat .env
DRY_RUN=1