[root@node01 ~]# etcdctl \
> --endpoint=https://10.0.20.31:2379 \
> --ca-file=/etc/kubernetes/cert/ca.pem \
> --cert-file=/etc/etcd/cert/etcd.pem \
> --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
member 20efe5e6128d9e63 is healthy: got healthy result from https://10.0.20.31:2379
member 7cf960cbc106b63f is healthy: got healthy result from https://10.0.20.33:2379
member e6886445a833720c is healthy: got healthy result from https://10.0.20.32:2379
cluster is healthy
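The check above uses the etcd v2 `cluster-health` command, whereas kube-apiserver v1.14 talks to etcd through the v3 gRPC API. Below is a minimal sketch of the equivalent v3 health check, reusing the endpoints and certificate paths from the command above. The function name `etcd_v3_health` is made up for illustration, and whether this check surfaces the problem is not known.

```shell
# Hypothetical helper: query every member over the etcd v3 API.
# Endpoints and certificate paths are copied from the v2 command above;
# only the flag names differ (v3 uses --endpoints/--cacert/--cert/--key).
etcd_v3_health() {
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://10.0.20.31:2379,https://10.0.20.32:2379,https://10.0.20.33:2379 \
    --cacert=/etc/kubernetes/cert/ca.pem \
    --cert=/etc/etcd/cert/etcd.pem \
    --key=/etc/etcd/cert/etcd-key.pem \
    endpoint health
}
```

Running `etcd_v3_health` on node01 exercises the same API version that the apiserver uses, so a difference between the v2 and v3 results would be worth noting.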
[root@node01 ~]# systemctl status etcd
● etcd.service - Etcd Server
Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2019-11-29 12:18:52 CST; 25min ago
Docs: https://github.com/coreos
Main PID: 1342 (etcd)
CGroup: /system.slice/etcd.service
└─1342 /opt/k8s/bin/etcd --data-dir=/data/k8s/etcd/data --wal-dir=/data/k8s/etcd/wal --name=node01 --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem -...
Nov 29 12:18:54 node01.tracy.com etcd[1342]: rejected connection from "10.0.20.31:39976" (error "EOF", ServerName "")
Nov 29 12:18:54 node01.tracy.com etcd[1342]: rejected connection from "10.0.20.31:39986" (error "EOF", ServerName "")
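Both rejected connections above come from 10.0.20.31, which is node01's own address. To check whether that holds across the whole log, here is a rough sketch that tallies the rejected connections per source IP. The function name `count_rejected` is hypothetical, and the pattern assumes the log line format shown above.

```shell
# Hypothetical helper: count etcd "rejected connection" log lines per source IP.
# Reads log text on stdin and prints "<count> <ip>", highest count first.
count_rejected() {
  grep 'rejected connection from' \
    | sed -E 's/.*rejected connection from "([0-9.]+):[0-9]+".*/\1/' \
    | sort | uniq -c | sort -rn \
    | awk '{print $1, $2}'
}
```

For example, `journalctl -u etcd --no-pager | count_rejected` would feed it the etcd unit's journal.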
2. Errors from the apiserver itself
With the etcd cluster confirmed healthy, the system log (messages) shows the following errors:
Nov 29 12:18:55 node01 kube-apiserver: W1129 12:18:55.730235 1339 asm_amd64.s:1337] Failed to dial 10.0.20.32:2379: grpc: the connection is closing; please retry.
Nov 29 12:18:55 node01 kube-apiserver: W1129 12:18:55.730298 1339 asm_amd64.s:1337] Failed to dial 10.0.20.33:2379: grpc: the connection is closing; please retry.
These errors keep repeating. After a while, the log shows the following:
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.741463 13902 storage_rbac.go:284] created rolebinding.rbac.authorization.k8s.io/system::leader-locking-kube-scheduler in kube-system
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.781732 13902 storage_rbac.go:284] created rolebinding.rbac.authorization.k8s.io/system:controller:bootstrap-signer in kube-system
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.821448 13902 storage_rbac.go:284] created rolebinding.rbac.authorization.k8s.io/system:controller:cloud-provider in kube-system
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.861600 13902 storage_rbac.go:284] created rolebinding.rbac.authorization.k8s.io/system:controller:token-cleaner in kube-system
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.901997 13902 storage_rbac.go:284] created rolebinding.rbac.authorization.k8s.io/system:controller:bootstrap-signer in kube-public
Nov 29 10:50:47 node01 kube-apiserver: W1129 10:50:47.985238 13902 lease.go:222] Resetting endpoints for master service "kubernetes" to [10.0.20.31]
Nov 29 10:50:47 node01 kube-apiserver: I1129 10:50:47.985971 13902 controller.go:606] quota admission added evaluator for: endpoints
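Judging by these last log lines, the apiserver appears to be working again.
The apiserver looks healthy, but in practice it is not, as the following command shows: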
[root@node01 ~]# kubectl get cs
Error from server (BadRequest): the server rejected our request for an unknown reason
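The BadRequest message alone gives little to go on. A possible next step, sketched below, is to make kubectl print the HTTP exchange and to ask the apiserver's health endpoint directly. The function name `apiserver_probe` is hypothetical, and the sketch assumes the same kubeconfig as the `kubectl get cs` call above.

```shell
# Hypothetical helper: gather more detail behind the BadRequest error.
apiserver_probe() {
  # -v=8 makes kubectl log the HTTP requests it sends and the responses it gets.
  kubectl get cs -v=8
  # Query the apiserver's own health endpoint through the current kubeconfig.
  kubectl get --raw /healthz
}
```

The verbose output shows which URL was actually requested, which helps tell an apiserver response apart from one produced by a proxy in front of it.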
Component versions
Kubernetes: v1.14.2, etcd: 3.3.11
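Cluster:
Component configuration files
etcd configuration file
etcd cluster status
flanneld configuration file
apiserver configuration file
nginx configuration file
I also tested with the nginx listen address set to 127.0.0.1.
Problem symptoms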
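1. Before starting the apiserver
Everything works normally before the apiserver is started. Once the apiserver has been configured, the etcd status is no longer clean and etcd starts reporting errors, as in the `systemctl status etcd` output above.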
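What I have tried
1. Downgraded etcd to 3.2.x: the problem persists.
2. Downgraded the kernel to 4.18: the problem persists.
3. Changed the config to connect directly to the node's own apiserver: the problem persists.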
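Could you please help me work out where the problem is? Thank you.
It is possible that some step that looks correct is actually wrong, but so far I have not been able to find the problem.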