erictcgs opened 5 years ago
Note that this also means nodes can't be added to the cluster, since that requires the install playbook to run. The run must include the etcd nodes (so that the primary master has that variable set correctly), then primary_master (to get the kubeadm install token), then the nodes. However, the playbook fails when trying to install packages on the master, and then can't generate the kubeadm token.
It looks like the root of my original comment was a `kubernetes_cni_version: "0.6.0-00"` variable set in the ansible inventory from a previous version of kubernetes. The upgrade scripts seem to ignore this variable (so those had worked and installed cni 0.7.5), but the install scripts use it, and that caused the failure.
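For anyone else hitting this, a quick way to spot stale version pins in an ansible inventory is to grep for `_version` variables. The inventory written below is a hypothetical stand-in so the example is self-contained; substitute the path to your real wardroom inventory.

```shell
# Hypothetical inventory fragment, created only so this example runs as-is;
# point the grep at your actual wardroom inventory file instead.
cat > /tmp/sample-inventory.yml <<'EOF'
all:
  vars:
    kubernetes_version: "1.12.7"
    kubernetes_cni_version: "0.6.0-00"   # stale pin left over from an older cluster
EOF

# List every explicit version pin so stale ones stand out.
grep -n '_version' /tmp/sample-inventory.yml
```

Any pin that doesn't match what the current release of the playbooks expects is a candidate for removal.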
Unfortunately the playbook still can't be run - I'm running into this issue when adding a new node: https://github.com/kubernetes/kubeadm/issues/907:
# /usr/bin/kubeadm join api.hostname.com:6443 --token=6uouog.xxxx --discovery-token-unsafe-skip-ca-verification --ignore-preflight-errors=all
...
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
configmaps "kubelet-config-1.12" is forbidden: User "system:bootstrap:6uouog" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
From github issues this has usually been due to a version mismatch, but here everything was installed/upgraded via wardroom and the versions seem to match.
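Since version skew is the usual suspect in these reports, here is the comparison I did by eye, as a small self-contained check. The two variables are hard-coded with this cluster's values so the snippet is reproducible; on a live node you would populate them from the binaries as the comments show.

```shell
# On a real node these would come from the installed binaries, e.g.:
#   kubelet_ver=$(kubelet --version | awk '{print $2}')
#   kubeadm_ver=$(kubeadm version -o short)
# Hard-coded here with the values reported above.
kubelet_ver="v1.12.7"
kubeadm_ver="v1.12.7"

if [ "$kubelet_ver" = "$kubeadm_ver" ]; then
  echo "versions match: $kubelet_ver"
else
  echo "version skew: kubelet=$kubelet_ver kubeadm=$kubeadm_ver"
fi
```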
On master:
# apt list --installed | grep kuber
cri-tools/kubernetes-xenial,now 1.12.0-00 amd64 [installed,automatic]
kubeadm/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubectl/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubelet/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubernetes-cni/kubernetes-xenial,now 0.7.5-00 amd64 [installed]
# kubelet --version
Kubernetes v1.12.7
# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.7", GitCommit:"6f482974b76db3f1e0f5d24605a9d1d38fad9a2b", GitTreeState:"clean", BuildDate:"2019-03-25T02:49:02Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
On new node:
# apt list --installed | grep kuber
cri-tools/kubernetes-xenial,now 1.12.0-00 amd64 [installed,automatic]
kubeadm/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubectl/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubelet/kubernetes-xenial,now 1.12.7-00 amd64 [installed,upgradable to: 1.14.3-00]
kubernetes-cni/kubernetes-xenial,now 0.7.5-00 amd64 [installed]
# kubelet --version
Kubernetes v1.12.7
# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.7", GitCommit:"6f482974b76db3f1e0f5d24605a9d1d38fad9a2b", GitTreeState:"clean", BuildDate:"2019-03-25T02:49:02Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
What is the state of the scoped token you are trying to use during this run? Are you sure that it has not expired?
I'm using the token generated by wardroom on the master - it fails during the wardroom node install. If I run `kubeadm token list` immediately on the master, the token is listed and appears valid, and if I run `kubeadm join` on the node manually with that token (all within a minute or so of the initial ansible run), it fails with the same error wardroom got.
Is there a role/rolebinding being misconfigured that's supposed to allow group system:bootstrappers:kubeadm:default-node-token
to access those configmaps?
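For reference, this is roughly the Role/RoleBinding pair that kubeadm 1.12 is supposed to create during init so bootstrap tokens can read the kubelet config. This is reconstructed from memory of kubeadm's defaults, so treat the exact names as approximate and compare against `kubectl -n kube-system get role,rolebinding` on the master:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeadm:kubelet-config-1.12
  namespace: kube-system
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubelet-config-1.12"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubeadm:kubelet-config-1.12
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubeadm:kubelet-config-1.12
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers:kubeadm:default-node-token
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
```

If the Role or RoleBinding is missing (which can happen after a partial upgrade), the "forbidden" error above is exactly what a joining node sees.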
On the master:
root@arcadebackup-clus8-master1-9c250a:~# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
kghrxz.jtfd32hksthjp7m9 23h 2019-09-04T14:57:01-04:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
On the node:
$ /usr/bin/kubeadm join api.c8....:6443 --token=kghrxz.jtfd32hksthjp7m9 --discovery-token-unsafe-skip-ca-verification --ignore-preflight-errors=all
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.2. Latest validated version: 18.06
[discovery] Trying to connect to API Server "api.c8....:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://api.c8....:6443"
[discovery] Cluster info signature and contents are valid and no TLS pinning was specified, will use API Server "api.c8....:6443"
[discovery] Successfully established connection with API Server "api.c8....:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
unable to fetch the kubeadm-config ConfigMap: failed to get config map: configmaps "kubeadm-config" is forbidden: User "system:bootstrap:kghrxz" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
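One detail worth noticing in that error: the username `system:bootstrap:kghrxz` is derived from the token ID, i.e. the part of the bootstrap token before the dot. So the failing identity really is the freshly issued token from the listing above, not some stale credential:

```shell
token="kghrxz.jtfd32hksthjp7m9"
# Bootstrap tokens have the form <token-id>.<token-secret>;
# the API server authenticates the join as system:bootstrap:<token-id>.
token_id="${token%%.*}"
echo "system:bootstrap:${token_id}"
```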
/kind bug
What steps did you take and what happened:
Ran the upgrade script to move a 1.11.6 cluster to 1.12.7. The masters failed due to temporary api server unavailability and ansible aborted. `kubectl get nodes` showed that the masters were successfully upgraded, so I tried to re-run the script to make sure all plays were performed; the script now fails on package install.
What did you expect to happen:
Detect that no change is necessary on the masters for stages that were already successful, and only apply the needed changes.
Anything else you would like to add:
Environment:
- branch: 1.12
- /etc/os-release: ubuntu 18.04

@craigtracey