lixiang2017 closed this issue 3 years ago.
Does "slave" here refer to a regular worker node? Also, is the version 1.7.5 or 1.17.5? And has kubelet certificate auto-rotation been enabled on the slave node?
Yes, it is a regular worker node. The version is v1.7.5:
[root@s4 ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
I just checked: v1.7.5 apparently does not enable certificate auto-rotation by default, because I could not find the following parameter in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
Environment="KUBELET_EXTRA_ARGS=--feature-gates=RotateKubeletServerCertificate=true"
But after I added this parameter back in and ran:
[root@s4 ~]# systemctl daemon-reload
[root@s4 ~]# systemctl restart kubelet
s4 (the slave) went NotReady.
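In case it helps with the NotReady state, the kubelet service status and log are the first things to check (a generic diagnostic sketch, not specific to this setup):
[root@s4 ~]# systemctl status kubelet
[root@s4 ~]# journalctl -u kubelet -n 50 --no-pager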
Hmm, 1.7.x does not support kubelet certificate auto-rotation; the certificate rotation feature was only added in 1.8.0.
Three masters, one slave. I have run ./update-kubeadm-cert.sh master on every master, and they all update fine. On the slave I run:
kubeadm1.7 join --token 27c516.e6ef88e9f68908e1 master_ip:6443 --node-name <node_name> --skip-preflight-checks
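As a side note, one way to confirm the renewed master certificates is to check their expiry dates directly (standard kubeadm paths assumed; adjust if your layout differs):
[root@master ~]# openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver.crt
[root@master ~]# openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver-kubelet-client.crt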
That should throw an error, though, because the node has already joined the K8s cluster.
The bootstrap token in 1.7.x does not expire (--token 27c516.e6ef88e9f68908e1 is exactly that token). A bootstrap token is a low-privilege authentication token that the kubelet presents to the master when requesting a certificate: if the kubelet starts up and finds no certificate, it uses the bootstrap token to request one from the master. So I think simply deleting the kubelet's kubeconfig file and restarting the kubelet, so that it requests a certificate again, should be enough. https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/
Out of curiosity, is production really still running 1.7.5? That must be a three- or four-year-old release by now.
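A minimal sketch of the delete-and-restart approach suggested above (paths assume this kubeadm 1.7 layout; keep a backup of the old kubeconfig first):
[root@s4 ~]# cp /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
[root@s4 ~]# rm /etc/kubernetes/kubelet.conf
[root@s4 ~]# systemctl restart kubelet
[root@s4 ~]# journalctl -u kubelet -n 50 --no-pager   # check whether a new certificate was requested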
No error; it succeeded. That is because I first deleted the slave's /etc/kubernetes/kubelet.conf, and it was then regenerated. Here is the log:
[root@s4 ~]# kubeadm1.7 join --token 27c516.e6ef88e9f68908e1 192.168.10.45:6443 --node-name s1.yuhuatai-bdmd.com --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "192.168.10.45:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.10.45:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.10.45:6443"
[discovery] Successfully established connection with API Server "192.168.10.45:6443"
[bootstrap] Detected server version: v1.7.5
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
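For completeness, the expiry of the newly issued kubelet client certificate can be checked by pulling it back out of the regenerated kubeconfig (a sketch; it assumes the usual single client-certificate-data entry in kubelet.conf):
[root@s4 ~]# grep client-certificate-data /etc/kubernetes/kubelet.conf | awk '{print $2}' | base64 -d | openssl x509 -noout -enddate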
As for whether production is still on 1.7.5: it is an old project that is still being maintained, and it has indeed been running for four years.
root@s1.yuhuatai-bdmd.com:~# kubectl get nodes
NAME STATUS AGE VERSION
s1.yuhuatai-bdmd.com Ready 4y v1.7.5
s2.yuhuatai-bdmd.com Ready 4y v1.7.5
s3.yuhuatai-bdmd.com Ready 4y v1.7.5
s4.yuhuatai-bdmd.com Ready 4y v1.7.5
New environments are basically all on 1.15, 1.16, or 1.20 now.
Judging from the log, it was kubeadm that requested the certificate on the node's behalf. As far as I remember, in current versions the kubelet requests its own certificate; I have not read the 1.7.x source code, though...
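On newer versions (not 1.7.x), a quick way to confirm that the kubelet obtained and rotates its own client certificate is to inspect the certificate the kubelet keeps on disk (path assumed; it can vary by setup):
[root@node ~]# openssl x509 -noout -subject -dates -in /var/lib/kubelet/pki/kubelet-client-current.pem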
Just tried it; it does not work. The token itself is fine, it is that one. The problem seems to be that the slave cannot find the token, because the slave's /etc/kubernetes/kubelet.conf is structured a bit differently from the master's: the slave's /etc/kubernetes/kubelet.conf has an extra tls-bootstrap-token-user entry.
[root@s4 ~]# cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: a_long_string1
    server: https://192.168.10.45:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubelet-csr
  name: kubelet-csr
- context:
    cluster: kubernetes
    user: tls-bootstrap-token-user
  name: tls-bootstrap-token-user@kubernetes
current-context: kubelet-csr
kind: Config
preferences: {}
users:
- name: kubelet-csr
  user:
    client-certificate-data: a_long_string2
    client-key-data: a_long_string3
- name: tls-bootstrap-token-user
  user:
    token: 27c516.e6ef88e9f68908e1
I also tried removing some of the certificate data from this file while keeping the tls-bootstrap-token-user entry, hoping it would be regenerated, but that failed as well.
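For reference, a quick way to see which context and user a kubelet kubeconfig actually points at (plain kubectl commands; the file path is the one from this thread):
[root@s4 ~]# kubectl --kubeconfig /etc/kubernetes/kubelet.conf config current-context
[root@s4 ~]# kubectl --kubeconfig /etc/kubernetes/kubelet.conf config view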
That is probably down to how 1.7.x works. From your join log above, in 1.7.x it is kubeadm that requests the certificate, not the kubelet; after all, the kubelet certificate rotation feature is still alpha in this version and disabled by default, so handing the bootstrap token to the kubelet and expecting it to request its own certificate will not work.
Since your kubeadm join method works, I would not fuss over it any further; this version is far too old, and digging any deeper is not really worth it.
Also, as far as I can see there are currently two ways to solve this once and for all:
Update the slave certificates on the old version (v1.7.5).