kubernetes / website

Kubernetes website and documentation repo:
https://kubernetes.io
Creative Commons Attribution 4.0 International

Document cluster resilience around upgrade #42712

Open sftim opened 1 year ago

sftim commented 1 year ago

This is a Feature Request

What would you like to be added: Add details about how to maintain resilience for your Kubernetes cluster during an upgrade.

Why is this needed: We have https://kubernetes.io/docs/tasks/administer-cluster/cluster-upgrade/#upgrade-other as a task-oriented guide, but it doesn't tell you much about the design considerations and constraints.

For example, if you upgrade etcd then a 3-node etcd cluster is briefly unable to tolerate some forms of partial outage (the surviving node cannot detect that it is the sole healthy survivor). You could set up a witness / fencing mechanism; a more cloud native approach is to scale it out to 5 nodes, do the upgrade and scale in.
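To make the arithmetic behind that example concrete, here is a minimal sketch of majority-quorum math. It assumes etcd's usual Raft majority rule (quorum is more than half of the configured members) and that an upgrade takes one member offline at a time. The function names are hypothetical and are not part of any etcd or Kubernetes API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` configured members."""
    return members // 2 + 1


def failures_tolerated(members: int, unavailable: int = 0) -> int:
    """Further member failures the cluster can absorb and still keep quorum.

    `unavailable` counts members that are already down, for example one
    member restarting as part of a rolling upgrade (an assumption here).
    Returns 0 when no further failure can be absorbed.
    """
    healthy = members - unavailable
    return max(healthy - quorum(members), 0)


if __name__ == "__main__":
    for size in (3, 5):
        steady = failures_tolerated(size)
        upgrading = failures_tolerated(size, unavailable=1)
        print(f"{size} members: tolerates {steady} failure(s) normally, "
              f"{upgrading} while one member is being upgraded")
```

Under those assumptions, a 3-member cluster has no spare capacity while one member is restarting, whereas a 5-member cluster can still lose one more member. That is the reasoning behind scaling out before the upgrade and scaling back in afterwards.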

Comments: Aim to cover the mixed-version proxy concept; in fact, we could move that page to be a heading in a new page if we add one.

/sig cluster-lifecycle /sig architecture

/language en /kind feature

sftim commented 1 year ago

@jpbetz FYI

sftim commented 1 year ago

/lifecycle frozen

mehabhalodiya commented 1 year ago

/triage accepted

k8s-triage-robot commented 1 week ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted