@zerkms i remember this problem, but i could not reproduce it: https://github.com/kubernetes/kubeadm/issues/1757#issuecomment-526370377
we should have re-opened the old issue instead of this new one.
As you can see, unless you specify it explicitly, os.Hostname is used, and the hostname of the machine is hq-srv11:
the join configuration is not used during kubeadm upgrade. where are you seeing a call to SetJoinDynamicDefaults from kubeadm upgrade apply?
but i could not reproduce it:
I provided more details in this request - it happens when you provide a kubeadm config.
where are you seeing a call to SetJoinDynamicDefaults from kubeadm upgrade apply?
It's really hard to find just by reading the code, but from the logs I found that at least here:
func PerformPostUpgradeTasks(client clientset.Interface, cfg *kubeadmapi.InitConfiguration, newK8sVer *version.Version, dryRun bool) error {
    errs := []error{}

    // Upload currently used configuration to the cluster
    // Note: This is done right in the beginning of cluster initialization; as we might want to make other phases
    // depend on centralized information from this source in the future
    if err := uploadconfig.UploadConfiguration(cfg, client); err != nil {
        errs = append(errs, err)
    }
by the time of the UploadConfiguration call it already uses the wrong value:
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[dryrun] Would perform action CREATE on resource "configmaps" in API group "core/v1"
[dryrun] Attached object:
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: 10.50.8.1:6443
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.16.0
    networking:
      dnsDomain: <redacted>
      podSubnet: 10.51.0.0/16
      serviceSubnet: 10.52.0.0/16
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      hq-srv11:
        advertiseAddress: 10.50.4.11
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: kubeadm-config
  namespace: kube-system
by the time of the UploadConfiguration call it already uses the wrong value:
what value is wrong in this config for you?
apiEndpoints:
  hq-srv11:
There is no Kubernetes node hq-srv11, there is only hq-srv11.<redacted-org-domain-name>.
But I cannot reliably trace back (without a debugger) where exactly cfg.NodeRegistration.Name is filled in.
i will try to reproduce this again tomorrow.
/cc
Thanks @zerkms I have reproduced the problem through the following steps:
# kubeadm init --node-name=foo
# kubeadm config view > config.yaml
# kubeadm upgrade apply v1.16.0 --dry-run --config=config.yaml
I'm going to dig into how we can solve this problem.
IMO, in the short term, we can solve this problem by manually passing a node-name flag (just like what we did in the init and join phases) (kubernetes/kubernetes#83180).
But in the long term, we need a way to automatically associate the node and the node name configured within the cluster.
we need a way to automatically associate the node and the node name configured within the cluster.
isn't it available in the kubelet certificate?
we need a way to automatically associate the node and the node name configured within the cluster.
isn't it available in the kubelet certificate?
Yeah, that might be one way. We can extract the host name from the certificate. But I'm not quite sure. @zerkms @neolit123
/kind bug
/remove-priority awaiting-more-evidence
Yeah, that might be one way. We can extract the host name from the certificate.
it can work, but we shouldn't rely on certs for node names.
IMO, in the short term, we can solve this problem by manually passing a node-name flag (just like what we did in the init and join phases)
the problem with that is that if the user messes up the name during upgrade, i think this will result in a new entry in the ClusterStatus.
# kubeadm init --node-name=foo
# kubeadm config view > config.yaml
# kubeadm upgrade apply v1.16.0 --dry-run --config=config.yaml
i can try testing this later today.
if i pass --node-name=foo.bar.zzz instead of foo in the above example, would the config.yaml still contain only foo in the ClusterStatus section?
never mind, this just uses the ClusterConfiguration object.
ok, so i did some investigation here.
@zerkms your suggestion to use certificates to fetch the node name is actually already implemented, but only when the user is not providing a config file and the configuration is fetched from the cluster. see https://github.com/kubernetes/kubernetes/blob/2e6b073a3f800654ec217e763fcb97412308a9db/cmd/kubeadm/app/util/config/cluster.go#L113
this is so because the dynamic defaulting of the node name from certificates happens only for nodes that already have the kubelet config and certificates present and the configuration is fetched from the cluster. if you pass a configuration file, kubeadm will default the node name to your hostname. this is by design.
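For illustration, a minimal sketch of that certificate lookup, assuming the client certificate is embedded in kubelet.conf (kubeadm also supports a certificate file reference, omitted here); the function name is hypothetical and error handling is simplified:

package main

import (
    "crypto/x509"
    "encoding/pem"
    "fmt"
    "strings"

    "k8s.io/client-go/tools/clientcmd"
)

// nodeNameFromKubeletConf reads the kubelet's embedded client certificate and
// derives the node name from the certificate subject, whose CN has the form
// "system:node:<node-name>".
func nodeNameFromKubeletConf(path string) (string, error) {
    config, err := clientcmd.LoadFromFile(path)
    if err != nil {
        return "", err
    }
    ctx, ok := config.Contexts[config.CurrentContext]
    if !ok {
        return "", fmt.Errorf("current context not found in %s", path)
    }
    authInfo := config.AuthInfos[ctx.AuthInfo]
    if authInfo == nil {
        return "", fmt.Errorf("no auth info %q in %s", ctx.AuthInfo, path)
    }
    block, _ := pem.Decode(authInfo.ClientCertificateData)
    if block == nil {
        return "", fmt.Errorf("no embedded PEM client certificate in %s", path)
    }
    cert, err := x509.ParseCertificate(block.Bytes)
    if err != nil {
        return "", err
    }
    // strip the "system:node:" prefix to recover the registered node name
    return strings.TrimPrefix(cert.Subject.CommonName, "system:node:"), nil
}

func main() {
    name, err := nodeNameFromKubeletConf("/etc/kubernetes/kubelet.conf")
    if err != nil {
        panic(err)
    }
    fmt.Println(name) // e.g. "hq-srv11.<redacted-org-domain-name>"
}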
dynamically defaulting your node name to a value from the kubelet and certificates when already passing --config to apply is an option, but i don't think we should do this.
the explicit flag that @SataQiu added is a workaround for your use case. there is a similar flag for the CRI socket. but i'm personally not in favor of adding more flags.
your existing workaround is to have such a config:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  name: the.fqdn.here
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
....
my question for you @zerkms is why are you passing --config to apply? this acts like reconfiguration and while kubeadm supports it, it should not be done in the first place. if your config is missing important information, it will be defaulted with dynamic values, such as the hostname of the node.
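A minimal sketch of the dynamic defaulting being described, assuming only that kubeadm falls back to the OS hostname (lowercased) when nodeRegistration.name is not set; the helper name is illustrative, not kubeadm's actual code:

package main

import (
    "fmt"
    "os"
    "strings"
)

// defaultNodeName: an explicit value from the config file wins; otherwise
// fall back to the OS hostname, normalized to lowercase (node names must be
// lowercase RFC 1123 subdomains).
func defaultNodeName(configured string) (string, error) {
    if configured != "" {
        return configured, nil
    }
    hostname, err := os.Hostname()
    if err != nil {
        return "", err
    }
    return strings.ToLower(strings.TrimSpace(hostname)), nil
}

func main() {
    name, err := defaultNodeName("") // nodeRegistration.name absent from the config
    if err != nil {
        panic(err)
    }
    fmt.Println(name) // e.g. "hq-srv11", not the FQDN registered in the cluster
}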
cc @rosti @fabriziopandini
@neolit123
my question for you @zerkms is why are you passing --config to apply?
nowhere in the upgrade documentation does it state whether it should be specified or not. And given that I used it initially to create the cluster, I assumed by default that now I must specify it every time during an upgrade as well.
If it's not the case, I think it may need a bit of clarification in the documentation, right?
Thanks nevertheless, now I see it's my mistake actually.
nowhere in the upgrade documentation does it state whether it should be specified or not
exactly, that is because we don't want users to use it.
the --config flag was added to upgrade to allow reconfiguration of the existing cluster, which is now supported using the kubeadm kustomize feature (see the changelog for 1.16). yet, reconfiguring the cluster using this flag is not recommended.
If it's not the case, I think it may need a bit of clarification in the documentation, right?
i agree. this needs a line or two in this document: https://github.com/kubernetes/website/blob/master/content/en/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade.md
/kind documentation
/assign
Thanks @neolit123 ! I learned a lot from this ticket. So we can already set the node name through the config file:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  name: the.fqdn.here
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
....
Emm... Adding a --node-name flag just saves you the trouble of changing the configuration file. Since init and join both support this flag, why not upgrade?
I'm curious about the bad effects of adding this flag.
And WDYT @zerkms ?
@SataQiu given that @neolit123 mentioned a kubeadm config should not be specified at all and it would all work - I think this report should be closed as works as designed.
@SataQiu ideally no config-backed flags should be added to upgrade. The existence of --config is a workaround for a specific case. However, I hope that in the future it will be removed and a proper reconfiguration CLI will be introduced in kubeadm.
Adding kind/UX, as this is also a UX issue (in the long term), just as it's a documentation issue in the short term.
/kind UX
@rosti: The label(s) kind/ux cannot be applied. These labels are supported: api-review, community/discussion, community/maintenance, community/question, cuj/build-train-deploy, cuj/multi-user, platform/aws, platform/azure, platform/gcp, platform/minikube, platform/other
/area UX
i've sent a couple of PRs:
This is a followup to https://github.com/kubernetes/kubeadm/issues/1757
What keywords did you search in kubeadm issues before filing this one?
upgrade kubeadm hostname
Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT
Versions
kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:34:01Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Environment:
Kubernetes version (use kubectl version):
Cloud provider or hardware configuration: bare metal
OS (e.g. from /etc/os-release): ubuntu:bionic
Kernel (e.g. uname -a): Linux hq-srv11 4.15.0-64-generic #73-Ubuntu SMP Thu Sep 12 13:16:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Others:
What happened?
During the kubeadm upgrade apply v1.16.0 --config=/path/to/config.yaml --dry-run run it ended up with an infinite loop of:

and the same with more verbose output:

I traced back to see where that value comes from and found the source of the problem:

As you can see, unless you specify it explicitly, os.Hostname is used, and the hostname of the machine is hq-srv11:

while nodes in the cluster have the explicitly set FQDN
What you expected to happen?
I believe the name of the node should be obtained from the API, or at least correlated with what's in the API, since the hostname does not necessarily match the node name (see the sketch at the end of this report).
How to reproduce it (as minimally and precisely as possible)?
Initialise an older-version cluster with a node with a non-default name and with a kubeadm config, using kubeadm init --node-name=foo, then upgrade using the kubeadm config again.
Anything else we need to know?
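A minimal sketch of the API correlation suggested under "What you expected to happen?" above, assuming a recent client-go (where List takes a context); resolveNodeName and the FQDN-prefix heuristic are illustrative, not an existing kubeadm API:

package main

import (
    "context"
    "fmt"
    "os"
    "strings"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

// resolveNodeName checks the local hostname against the node names the API
// server actually knows about, instead of blindly trusting os.Hostname.
func resolveNodeName(client kubernetes.Interface) (string, error) {
    hostname, err := os.Hostname()
    if err != nil {
        return "", err
    }
    nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        return "", err
    }
    for _, node := range nodes.Items {
        // exact match, or the registered name is the FQDN form of the short hostname
        if node.Name == hostname || strings.HasPrefix(node.Name, hostname+".") {
            return node.Name, nil
        }
    }
    return "", fmt.Errorf("hostname %q does not match any node known to the API server", hostname)
}

func main() {
    config, err := clientcmd.BuildConfigFromFlags("", "/etc/kubernetes/admin.conf")
    if err != nil {
        panic(err)
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }
    name, err := resolveNodeName(client)
    if err != nil {
        panic(err)
    }
    fmt.Println("resolved node name:", name) // e.g. "hq-srv11.<redacted-org-domain-name>"
}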