fabriziopandini opened this issue 5 years ago
/cc
We have to be careful, but we certainly need to act on this. The plan is that from etcd 3.5, new members will join only as learners. In etcd 3.5 it will be possible to use a learner node for reads, but the problem with writes remains. And since load balancers are out of scope for kubeadm, things might become a bit difficult. We probably need to direct API servers to healthy leaders, possibly via an etcd LB. Another possibility is to not expose API servers that have a local learner etcd node via the API LB (not sure if this would actually work, though). In short, we need to experiment a bit with this to find what's viable and easy to use.
/assign
/cc
Some context on this: https://github.com/etcd-io/etcd/pull/11640; we might want to wait for an etcd version that includes this patch.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle stale
/remove-lifecycle stale
/cc
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Since Kubernetes 1.23, etcd 3.5.0 is used: https://etcd.io/docs/v3.5/learning/design-learner/. This seems like a nice candidate for 1.24.
What is the suggested way to use the learner mode?
And I see in the etcd future plans https://github.com/etcd-io/etcd/pull/10887, a pull request that may add the ability to auto-promote the learner. (Will we wait for the auto-promote feature?)
I am not up to date on the learner mode support in etcd, but auto-promotion sounds better. The change in kubeadm will need a new KEP: https://github.com/kubernetes/enhancements/tree/master/keps
On Dec 14, 2021, Paco Xu wrote:
> What is the suggested way to use the learner mode?
> - join the member as a learner
> - wait for the data to align and then promote it to a voting member (kubeadm needs to wait for this to happen)
>
> And I see in the etcd future plans etcd-io/etcd#10887, a pull request that may add the ability to auto-promote the learner. (Will we wait for the auto-promote feature?)
/remove-lifecycle stale
/remove-lifecycle stale
we had a discussion about this today with @fabriziopandini. learner mode was used in Talos (https://www.talos.dev/v0.12/introduction/what-is-new/#etcd) and we might want to have it in kubeadm.
i can start writing a doc / KEP for this, but if someone would like to work on the code / unit tests / e2e tests, that would be great.
@fabriziopandini @pacoxu and others watching the ticket.
i've created a HackMD draft for the learner mode proposal: https://hackmd.io/@DAKGcrh_RpC5vlt8w5bf8A/r1qoLh9zj
comments are welcome. we can notify kubespray and Cluster API for feedback too.
once we have some agreement, i can PR the KEP in a raw / draft stage and keep it provisional. if we get a contributor who wants to work on this, we can start tracking the feature for a kubeadm / k8s release milestone. (e.g. this work is not planned for 1.26 unless we get a contributor to sign up)
Thx! I've added it to the Cluster API office hours agenda for today.
I'll also try to take a look, but it could take a bit; just not enough bandwidth right now.
updated link https://hackmd.io/@DAKGcrh_RpC5vlt8w5bf8A/r1qoLh9zj
i can PR the proposal around end of next week in k/enhancements
The KEP looks good. The detailed steps, according to the etcd docs, are:
1. Call the member add --learner API.
2. Once .canPromote==true, call the member promote API. There would be a new timeout here.

My only question: can we skip step 2 above so that we use an etcd learner as a standby node? I mean, we don't promote it and leave that to users. Users can promote it in case the master has some problems, or keep it as a learner/standby node for failover. Should we add a flag for this?

BTW, there is no new learner-mode-related bug in the etcd repo. (I am not sure whether there are just not enough users of the feature in the etcd community, or the feature is very stable.)
my vote would be to preserve the current behavior and not have a flag, i.e. always try to promote.
> BTW, there is no new learner-mode-related bug in the etcd repo. (I am not sure whether there are just not enough users of the feature in the etcd community, or the feature is very stable.)
that may be a concern, but the Talos project is using it too and maybe it just works fine.
we can ask etcd maintainers later.
KEP PR is here: https://github.com/kubernetes/enhancements/pull/3615
k/e tracking issue: https://github.com/kubernetes/enhancements/issues/3614
/remove-lifecycle stale
The alpha-level code is merged for v1.27 in https://github.com/kubernetes/kubernetes/pull/113318
How was the issue with the apiserver failing when requesting against a learner node addressed? https://github.com/kubernetes/kubeadm/issues/1793#issuecomment-532630048 https://github.com/etcd-io/etcd/issues/12789
As of etcd 3.5.6 and Kubernetes 1.23.14, write requests will fail when they hit a learner node:
etcdserver: rpc not supported for learner
Is this handled in newer k8s-apiserver versions?
> Is this handled in newer k8s-apiserver versions?
@juliantaylor TMK, no. also, this is the first time i've seen the issue. EDIT: NVM, i recall the k/kubeadm comment; but given that some k8s distros exclusively use learners nowadays, this should no longer be an issue
^ @pacoxu @ahrtr
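Since a learner rejects write RPCs with "etcdserver: rpc not supported for learner", a client such as the apiserver should only be pointed at voting members. A toy sketch of that filtering, using a simplified `Member` struct as a stand-in for etcd's membership entry (the real one also carries an `IsLearner` field):

```go
package main

import "fmt"

// Member is a simplified stand-in for an etcd membership entry.
type Member struct {
	ClientURL string
	IsLearner bool
}

// votingEndpoints returns only the endpoints that accept writes,
// dropping learners, which reject write RPCs.
func votingEndpoints(members []Member) []string {
	var eps []string
	for _, m := range members {
		if !m.IsLearner {
			eps = append(eps, m.ClientURL)
		}
	}
	return eps
}

func main() {
	members := []Member{
		{ClientURL: "https://10.6.0.167:2379"},
		{ClientURL: "https://10.6.0.133:2379", IsLearner: true},
	}
	// Only the voting member's endpoint should be handed to the apiserver.
	fmt.Println(votingEndpoints(members))
}
```

Once the learner is promoted, its endpoint can be added back to the client's endpoint list.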
I think there is a chicken-and-egg issue with the implementation of https://github.com/kubernetes/kubernetes/pull/113318.
When adding a new member as a learner, the new member is promoted directly, and the promotion is awaited until it succeeds. The problem is that the static pod manifest for etcd is only written once the promotion has succeeded, but there will never be an etcd container running without the manifest. See cmd/kubeadm/app/phases/etcd/local.go#L152-L166.
Or am I missing something here?
Edit:
Adding some logs of the failed promotion
I0112 15:44:34.662336 8944 etcd.go:123] update etcd endpoints: https://10.6.0.167:2379
I0112 15:44:34.662480 8944 local.go:151] [etcd] Adding etcd member: https://10.6.0.133:2380
I0112 15:44:34.668641 8944 etcd.go:394] [etcd] Adding etcd member as learner: 68747470733a2f2f31302e362e302e3133333a32333830
I0112 15:44:34.674268 8944 etcd.go:463] [etcd] Promoting a learner as a voting member: 5747ee42cd477756
{"level":"warn","ts":"2023-01-12T15:44:34.674Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0007e6c40/10.6.0.167:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"}
I0112 15:44:34.674985 8944 etcd.go:475] [etcd] Promoting the learner 5747ee42cd477756 failed: etcdserver: can only promote a learner member which is in sync with leader
I0112 15:44:34.799658 8944 etcd.go:463] [etcd] Promoting a learner as a voting member: 5747ee42cd477756
{"level":"warn","ts":"2023-01-12T15:44:34.800Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0007e7180/10.6.0.167:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"}
I0112 15:44:34.800447 8944 etcd.go:475] [etcd] Promoting the learner 5747ee42cd477756 failed: etcdserver: can only promote a learner member which is in sync with leader
I added a WIP PR with a tested implementation. With this PR, the static pod manifests are created before the member is added to the cluster.
> @juliantaylor TMK, no. also this is the first time i see the issue.

Basically there is no change in this logic in etcdserver in 3.5, so the error message below should NOT be a new issue in etcd 3.5.x.

> As of etcd 3.5.6 and Kubernetes 1.23.14, write requests will fail when they hit a learner node: etcdserver: rpc not supported for learner
I only tested it in my clusters, and maybe the etcd manifest was left over from before I rejoined; that is why I did not hit this bug.
Logically, this would be a problem. Thanks for the fix; I will test and review it later.
> @juliantaylor TMK, no. also this is the first time i see the issue.

Basically there is no change in this logic in etcdserver in 3.5:
- https://github.com/etcd-io/etcd/blob/715a0047faba060577841b13c87e9b6a1269eaa0/server/etcdserver/api/v3rpc/interceptor.go#L53-L55
- https://github.com/etcd-io/etcd/blob/715a0047faba060577841b13c87e9b6a1269eaa0/server/etcdserver/api/v3rpc/interceptor.go#L222-L224

So the error message below should NOT be a new issue in etcd 3.5.x.

> As of etcd 3.5.6 and Kubernetes 1.23.14, write requests will fail when they hit a learner node: etcdserver: rpc not supported for learner
yes, it is not a new problem; it has existed since learner mode exists. It does imply you need to reconfigure and restart the apiservers every time before an etcd instance is restarted (or after one is added); if that is handled by kubeadm, there is no issue.
ah, restarting is not a problem as it does not put the node back into learner mode. So from kubeadm's perspective everything should be ok if it adds the node to the apiserver after promotion.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Looks like this has been implemented, but should stay open until this is graduated.
Updated the issue description for the implementation history.
The CI: https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-learner-mode-latest. This feature has been alpha since 1.27. I wonder if we could promote it to beta in the v1.29 release cycle.
Is there anything further that we should do to promote it? I opened https://github.com/kubernetes/kubernetes/pull/120228.
@tobiasgiese : maybe you could provide some feedback, I think you folks still use it right?
> maybe you could provide some feedback, I think you folks still use it right?
We (Mercedes-Benz) are using it already, yes. We have also backported it to v1.2[4-6] (since https://github.com/kubernetes/kubernetes/pull/115038) and it is working quite well. We have never had any problems with the learner mode, and we have a lot of nightly builds (about 50 periodic nightly builds and 40 Prow-triggered builds/jobs).
great feedback. thank you @tobiasgiese
I suppose that we should graduate this feature later in v1.31+ and get more feedback before GA.
So no action item for v1.30.
Growing a local etcd cluster is a complex operation, and in the past we already faced some issues, e.g. https://github.com/kubernetes-sigs/kind/issues/588
Now that the implementation of the etcd learner mode is progressing, we should start considering whether to use it in kubeadm in order to make the join --control-plane implementation more robust.
at a high level what we would like to achieve is:
Ref docs:
(edit by neolit123)
1.26:
1.27(alpha):
1.29(beta):
1.32(GA):
1.33: