kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

etcd: throttle restart for availability #11677

Closed VannTen closed 2 weeks ago

VannTen commented 3 weeks ago

What type of PR is this? /kind bug

What this PR does / why we need it: During an upgrade, all etcd members are restarted at once. This can impact the availability of the etcd cluster and, in turn, of the Kubernetes cluster.

Limit the number of concurrent restarts so that the etcd cluster keeps quorum.
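To illustrate the reasoning (this is not the change made in this PR, only a minimal Python sketch): an etcd cluster of N voting members needs a majority of N // 2 + 1 members up to keep quorum, so at most N minus that majority can be restarting at the same time. The function names below are hypothetical and exist only for this example.

```python
from typing import List


def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def max_concurrent_restarts(members: int) -> int:
    """How many members may be down at once while a majority stays up."""
    return members - quorum(members)


def restart_batches(names: List[str]) -> List[List[str]]:
    """Split members into restart batches sized to preserve quorum.

    Clusters of one or two members have no spare capacity, so they are
    restarted one at a time and a brief loss of quorum is unavoidable.
    """
    batch_size = max(1, max_concurrent_restarts(len(names)))
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


if __name__ == "__main__":
    # Hypothetical member names, for illustration only.
    for batch in restart_batches(["etcd1", "etcd2", "etcd3", "etcd4", "etcd5"]):
        print(batch)
```

With five members the majority is three, so the sketch restarts at most two members per batch. With three members it restarts them one at a time.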

Which issue(s) this PR fixes: Fixes #11645

Does this PR introduce a user-facing change?:

HA etcd cluster keeps quorum during upgrades.

k8s-ci-robot commented 3 weeks ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: VannTen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/kubespray/blob/master/OWNERS)~~ [VannTen]

Approvers can indicate their approval by writing `/approve` in a comment

Approvers can cancel approval by writing `/approve cancel` in a comment

VannTen commented 3 weeks ago

/ok-to-test

yankay commented 2 weeks ago

Thanks @VannTen /lgtm