aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS]: One-click Full Cluster Upgrade #600

Open mohitanchlia opened 4 years ago

mohitanchlia commented 4 years ago

Automation in EKS as a service has come only in bits and pieces. EKS with managed nodes is not very useful without a one-click full upgrade where the EKS version, aws-node, DNS, etc., along with the worker nodes, get upgraded without manually running or orchestrating commands. This can be broken down into two pieces: 1) full cluster upgrade including nodes, and 2) worker-node-only upgrades for ongoing AMI rotation. This is critical functionality.

tabern commented 4 years ago

Renaming this to 'One-click Full Cluster Upgrade'. Today EKS managed nodes supports worker node upgrades for AMI rotation, but this is not yet in the console (https://github.com/aws/containers-roadmap/issues/605). See API documentation here: https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateNodegroupVersion.html
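For reference, that API is also exposed through the AWS CLI. A minimal sketch, assuming a hypothetical cluster and node group name (the real command needs AWS credentials and an existing managed node group):

```shell
# Sketch only: the cluster and node group names are made up.
CLUSTER="my-cluster"
NODEGROUP="standard-workers"

# update-nodegroup-version is the CLI wrapper around the
# UpdateNodegroupVersion API linked above. With no --kubernetes-version or
# --release-version argument, it rolls the node group to the latest AMI
# release for the cluster's current Kubernetes version.
CMD="aws eks update-nodegroup-version --cluster-name $CLUSTER --nodegroup-name $NODEGROUP"
echo "$CMD"   # printed instead of executed so the sketch is safe to run
```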

mtparet commented 4 years ago

I just read AWS's official documentation for upgrading an EKS cluster: we have to manually execute kubectl commands to upgrade critical Kubernetes components. Really, for a "managed" service, is this a joke? Even updating my on-premises cluster is easier than updating the so-called "managed" AWS Kubernetes service.

https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
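To illustrate the kind of manual step that guide describes: after a control-plane upgrade you bump the kube-proxy DaemonSet image yourself. A sketch of that step, with a hypothetical registry account and tag (the guide lists the correct, region- and version-specific values):

```shell
# Illustrative only: REGISTRY and TAG below are placeholder values; the EKS
# upgrade guide gives the exact image URI for each region and k8s version.
REGISTRY="602401143452.dkr.ecr.us-west-2.amazonaws.com"
TAG="v1.21.2-eksbuild.2"

# The manual kube-proxy bump the docs describe after upgrading the
# control plane.
CMD="kubectl set image daemonset/kube-proxy -n kube-system kube-proxy=$REGISTRY/eks/kube-proxy:$TAG"
echo "$CMD"   # printed instead of executed so the sketch is safe to run
```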

irperez commented 4 years ago

In Azure AKS, this is a single-click operation with a version drop-down. [screenshot]

sandrom commented 4 years ago

The upgrade process can't get enough love, please work on this! It is incredibly important for production workloads to have a managed process for this :)

raravena80 commented 4 years ago

+1 for this

ghost commented 4 years ago

+1. EKS needs to close the gap with AKS and GKE.

nydalal commented 4 years ago

+1

damscott commented 4 years ago

This feature could impact people working under the assumption that AWS will not modify resources running inside the cluster, notably kube-proxy (#657).

kylecompassion commented 3 years ago

+1

alexey-pankratyev commented 2 years ago

+1

smrutiranjantripathy commented 2 years ago

Hi Team,

There is a sample package, eks-one-click-cluster-upgrade, in aws-samples that provides similar functionality. It is a CLI utility that can be used to carry out upgrades. Please check this package and share your feedback.

kareem-elsayed commented 2 years ago

We can still say that EKS is not a fully managed Kubernetes engine if, for every release, we have to spend time and effort checking the release notes for every add-on.

MMartyn commented 2 years ago

One other aspect of upgrades that would be great to include in this effort is the ability to configure the upgrade timeout for node groups. Currently, if it takes longer than 15 minutes to replace a node, the upgrade fails and rolls back.

Related doc: https://docs.aws.amazon.com/eks/latest/userguide/managed-node-update-behavior.html

Relevant part:

Drains the pods from the node. If the pods don't leave the node within 15 minutes and there's no force flag, the upgrade phase fails with a PodEvictionFailure error. For this scenario, you can apply the force flag with the update-nodegroup-version request to delete the pods.

neerajprem commented 1 year ago

+1, we must have it.

eperdeme commented 3 months ago

I'm not really understanding the value here. In a cluster with many add-ons needed to make it function, such as Istio, Argo, external-dns, Prometheus, cert-manager, etc., bumping the VPC CNI/kube-proxy is a trivial task and easily automated via your GitOps management methods.
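For clusters that use EKS managed add-ons, that bump is indeed a single API call per add-on rather than kubectl surgery. A hedged sketch, with a hypothetical cluster name and add-on version:

```shell
# Sketch only: cluster name and add-on version below are made up; list
# valid versions with 'aws eks describe-addon-versions'.
# --resolve-conflicts OVERWRITE lets EKS overwrite fields that were
# changed out-of-band on the existing kube-system resources.
CMD="aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni --addon-version v1.12.6-eksbuild.2 --resolve-conflicts OVERWRITE"
echo "$CMD"   # printed instead of executed so the sketch is safe to run
```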

Is the target customers running more out-of-the-box Kubernetes with a basic set of AWS-provided add-ons?