kubernetes / kubeadm

Aggregator for issues filed against kubeadm
Apache License 2.0
3.77k stars 717 forks source link

Add kubelet instance configuration to configure CRI socket for each node #3042

Open HirazawaUi opened 8 months ago

HirazawaUi commented 8 months ago

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

Anything else we need to know?

This issue wants to remove container runtime interface (CRI) socket annotations from Node objects in Kubernetes and no longer added --container-runtime-endpoint args in kubeadm-flags.env ( --container-runtime-endpoint args deprecated in kubelet), this annotation and args are used to specify the CRI socket endpoint on each node that the kubelet uses to communicate with the container runtime.

Add kubelet-instance-config.yml locally to store container runtime interface (CRI) sockets, and in kubeadm init, join and upgrade, override the global kubelet configuration through kubelet-instance-config.yml.

More detailed design at: https://hackmd.io/@r2Yq-PVoRcK2OaGypnwPMg/rJFChACCp/edit

### Tasks

edit by neolit123:

1.32 alpha

HirazawaUi commented 8 months ago

/cc @neolit123 @pacoxu

neolit123 commented 8 months ago

@HirazawaUi we need a KEP for this, ket's see what @pacoxu can comment about the previous work done on this.

@pacoxu can you please post links to your previous KEP work.? also didn't we already have a tracking issue here in k/kubeadm?

maybe we can close KEP PRs and let @HirazawaUi to take over.

neolit123 commented 8 months ago

also are you willing to work on this @HirazawaUi there are a lot of things to update around this change across multiple releases. and i don't think me or anyone else will have a lot of time for it.

too many questions

HirazawaUi commented 8 months ago

also are you willing to work on this @HirazawaUi there are a lot of things to update around this change across multiple releases. and i don't think me or anyone else will have a lot of time for it.

  • KEP design how is it going to work exactly, do we need a feature gate
  • k8s.io docs update
  • k/k code update
  • do we need a e2e test in k/kubeadm

too many questions

Yes, I'll try to finish it.

HirazawaUi commented 7 months ago

@pacoxu What do you think about this? I would love to hear your comments :)

pacoxu commented 7 months ago

The KEP

My current proposal would be something like https://github.com/kubernetes/enhancements/pull/3930#issuecomment-1931657825

i meant a generic patches approach where a local config file overrides the global config downloaded from the cm. not the --patches feature.

this local file must be stored somewhere. perhaps in the same dir as config.yaml, but called config-instance.yaml. IMO it has a number of tricky aspects that need to be covered in the design doc for init, join, upgrade.

Above is @neolit123 your proposal in comment here https://github.com/kubernetes/enhancements/pull/3930/files#r1177682829.

I prefer this solution.

BTW, /var/lib/kubelet/kubeadm-flags.env will only have --container-runtime-endpoint later. I would like to make this cri-socket-annotation and runtime endpoint configuration to config-instance.yaml under /var/lib/kubelet.

HirazawaUi commented 6 months ago

Thanks, I have missed this comment, I will resurrect kep as soon as possible.

neolit123 commented 5 months ago

note if this ever becomes part of kubeadm, we can close this ticket: https://github.com/kubernetes/kubeadm/issues/1924

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

neolit123 commented 2 months ago

/remove-lifecycle stale

neolit123 commented 1 month ago

@HirazawaUi i added some TODOs a tracking in the issue description here. please keep it updated with links - ....

HirazawaUi commented 1 month ago

@HirazawaUi i added some TODOs a tracking in the issue description here. please keep it updated with links - ....

OK, thank you so much for your help.

neolit123 commented 4 days ago

@HirazawaUi @pacoxu

i think i found one problem wih the feature gate. https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/upgrade/postupgrade.go#L138

here we read the value from the env file, but we also need to migrate users away from using the flag --container-runtime for the kubelet. so on upgrade we need to mutate the env file to no longer have the --container-runtime flag. the problem here is that if the kubelet receives the flag --container-runtime then it will take precedence over the value passed from the kubelet config.yaml and the config yaml will be ignored.

this is actually something we might have to backport as a fix for 1.32. in the alpha.

we have this utility for writing the env file. https://github.com/kubernetes/kubernetes/blob/95d71c464a6a5f81ae36bddc86972b3b2a1c40a5/cmd/kubeadm/app/phases/kubelet/flags.go#L65

as far as i can tell we are not doing this env file mutation on upgrade, but please correct me if i am wrong.

pacoxu commented 4 days ago

Probably, we can keep it as is in v1.32, as the value in env file and instance file are the same.

neolit123 commented 4 days ago

i think i agree to not backport the fix for this alpha feature. i guess the problem is that as long as --container-runtime is in the env file it means the feature gate doesn't work. for new clusters however init/join would not have the --container-runtime flag in the env file i think @HirazawaUi to confirm.

HirazawaUi commented 3 days ago

here we read the value from the env file, but we also need to migrate users away from using the flag --container-runtime for the kubelet. so on upgrade we need to mutate the env file to no longer have the --container-runtime flag. the problem here is that if the kubelet receives the flag --container-runtime then it will take precedence over the value passed from the kubelet config.yaml and the config yaml will be ignored.

Yes, that is indeed a problem.

for new clusters however init/join would not have the --container-runtime flag in the env file i think @HirazawaUi to confirm.

That's right.

Given that we have never mentioned this (modifying an existing env file) in the KEP, I agree with @pacoxu's view that we should keep it as is in 1.32 and move it to beta in 1.34, WDYT?

I will update the KEP later to add this description and revise it according to the implemented PR (I had planned to do this last week, but it was delayed due to being reinfected with COVID-19).

neolit123 commented 3 days ago

1.32 and move it to beta in 1.34, WDYT?

1.34 SGTM, but this issue is something we can fix in 1.33 as well and still keep it alpha. we can then check in the e2e test the flag is correctly removed.

I will update the KEP later to add this description and revise it according to the implemented PR (I had planned to do this last week, but it was delayed due to being reinfected with COVID-19).

ok, take care of your self.