easzlab / kubeasz

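Installs a K8S cluster with Ansible scripts and explains how the components interact; simple and direct, and unaffected by network conditions in mainland China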
https://github.com/easzlab/kubeasz
10.53k stars 3.53k forks

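Cluster with 2 masters and 2 nodes, etcd installed on master1 and both nodes: when master1 goes down the cluster is unavailable, but when master2 goes down the cluster stays available #1340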

Closed fmonkeyaksd closed 8 months ago

fmonkeyaksd commented 11 months ago

What happened?

After master1 goes down, `systemctl status kube-apiserver` reports the error: addConn.createTransport failed to connect to {https://10.4.29.213:2379}

What did you expect to happen?

The cluster keeps working normally when either one of the two masters goes down.

How can we reproduce it (as minimally and precisely as possible)?

Deploy a cluster with 2 masters and 2 nodes, with etcd installed on master1 and both nodes. When master1 goes down, the cluster is unavailable; when master2 goes down, the cluster stays available.

Anything else we need to know?

No response

Kubernetes version

1.17.2

Kubeasz version

2.0.0

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.3"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.3:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION=7
$ uname -a
Linux 12922133.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
```

Related plugins (CNI, CSI, ...) and versions (if applicable)

gjmzj commented 11 months ago

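Is master1 the deployment node? You could try moving the deployment node to a separate machine. With master1 down, is it possible that the deployment node was gone and you concluded from that that the cluster was unavailable?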

fmonkeyaksd commented 10 months ago

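I tracked down the root cause: during cluster installation, the apiserver configured in /etc/kubelet.kubeconfig and /etc/kube-proxy.kubeconfig on the worker nodes is port 6443 on the cluster's default first master node. As a result, when the first master goes down, the worker nodes can no longer connect to the apiserver.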
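One way to check for this condition is to read the `server:` entry from a kubeconfig and see whether it targets a single master directly. The following is a minimal sketch, not part of kubeasz: the sample kubeconfig text, the placeholder address `192.0.2.10`, and the helper names `apiserver_endpoints` and `points_at_single_master` are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical kubeconfig excerpt. The address is a documentation placeholder,
# not a value taken from the issue.
SAMPLE_KUBECONFIG = """\
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority-data: REDACTED
    server: https://192.0.2.10:6443
  name: kubernetes
"""

# Matches lines of the form "server: <url>" at any indentation.
SERVER_RE = re.compile(r"^[ \t]*server:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def apiserver_endpoints(kubeconfig_text):
    """Return every `server:` URL found in the kubeconfig text."""
    return SERVER_RE.findall(kubeconfig_text)


def points_at_single_master(kubeconfig_text, master_ips):
    """Return True if any server URL targets one of the given master IPs."""
    for url in apiserver_endpoints(kubeconfig_text):
        if urlparse(url).hostname in master_ips:
            return True
    return False


if __name__ == "__main__":
    print(apiserver_endpoints(SAMPLE_KUBECONFIG))
    print(points_at_single_master(SAMPLE_KUBECONFIG, {"192.0.2.10"}))
```

On a real worker node, the text would come from reading the kubeconfig files named above; if the check returns True for a master's address, that node depends on that one master being reachable.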

im-jinxinwang commented 10 months ago

@fmonkeyaksd Two etcd members? Two etcd members do not give you high availability.
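The point about member count follows from majority quorum: etcd needs more than half of its members available to accept writes. The sketch below works through that arithmetic; the function names `quorum` and `failure_tolerance` are illustrative only.

```python
def quorum(members):
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def failure_tolerance(members):
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"members={n} quorum={quorum(n)} tolerates={failure_tolerance(n)}")
```

By this arithmetic a two-member cluster tolerates no failures, while the three-member layout described in the issue (master1 plus both nodes) tolerates the loss of one member.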

github-actions[bot] commented 9 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 8 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.