kubernetes / kubeadm

Is it the expected behaviour when we take 2 Kubernetes master nodes down out of 3? #2188

Closed · Sjnahak closed this issue 4 years ago

Sjnahak commented 4 years ago

Versions

kubeadm version (use kubeadm version): v1.18.0

Kubeadm config file:

/etc/kubernetes/kubeadm/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: "v1.18.0"
# REPLACE with the load balancer IP
controlPlaneEndpoint: "LOADBALANCER-IP:6443"
networking:
  serviceSubnet: "10.96.0.0/12"
  podSubnet: "10.244.0.0/16"
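
For what it's worth, the HA guide also allows controlPlaneEndpoint to be a DNS name that resolves to the load balancer, which keeps the endpoint remappable later; a hedged sketch with a placeholder name:

controlPlaneEndpoint: "lb.example.com:6443"   # placeholder DNS name pointing at the load balancer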


What happened?

I set up a 3-master, 2-worker Kubernetes cluster for HA using kubeadm (stacked etcd topology).

When I take down 2 of the master nodes, kubectl commands stop responding on the single remaining master.

The kube-apiserver also stops listening on port 6443, and the kubelet is not able to start any process or update the node status.

But when 2 master nodes are running, the cluster operates fine.
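
(For context, one way to check the health of the stacked etcd members from a working control-plane node is sketched below; the pod name is a placeholder and the certificate paths assume kubeadm's default layout.)

# hedged sketch: query etcd health through its static pod; <control-plane-host> is a placeholder
kubectl -n kube-system exec etcd-<control-plane-host> -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health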

What you expected to happen?

The kube-apiserver should keep running on the single remaining master when the other 2 masters are down.

Anything else we need to know?

Please find the kubelet logs below, and let me know if you need any other info.

Jun 17 10:17:59 "Failinghostname" kubelet: E0617 10:17:59.656953   16962 event.go:269] Unable to write event: 'Patch https://LOADBALANCER IP:6443/api/v1/namespaces/kube-system/events/kube-apiserver-"Failinghostname".dc.example.com.161949b283f69b41: read tcp HOSTIP:57064->LOADBALANCER IP:6443: use of closed network connection' (may retry after sleeping)

Jun 17 10:18:06 "Failinghostname" kubelet: E0617 10:18:06.683744   16962 controller.go:178] failed to update node lease, error: rpc error: code = Unknown desc = context deadline exceeded

Jun 17 10:18:06 "Failinghostname" kubelet: I0617 10:18:06.683950   16962 controller.go:106] failed to update lease using latest lease, fallback to ensure lease, err: failed 5 attempts to update node lease

Jun 17 10:18:16 "Failinghostname" kubelet: W0617 10:18:16.074492   16962 status_manager.go:556] Failed to get status for pod "kube-controller-manager-"Failinghostname".dc.example.com_kube-system(16c0b106893c92aa29fe120498337172)": etcdserver: request timed out
Jun 17 10:18:16 "Failinghostname" kubelet: E0617 10:18:16.078655   16962 controller.go:136] failed to ensure node lease exists, will retry in 200ms, error: etcdserver: request timed out

Jun 17 10:18:22 "Failinghostname" kubelet: W0617 10:18:22.981834   16962 kubelet_pods.go:858] Unable to retrieve pull secret ingress-controller/ingress-reg-cred for ingress-controller/haproxy-ingress-7q4kt due to secret "ingress-reg-cred" not found.  The image pull may not succeed.
Jun 17 10:18:23 "Failinghostname" kubelet: E0617 10:18:23.073551   16962 event.go:260] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-"Failinghostname".dc.example.com.161949b283f69b41", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"3079210", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-apiserver-"Failinghostname".dc.example.com", UID:"5077e0841ae3e87360fb23fd4d80ec7a", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver}"}, Reason:"Unhealthy", Message:"Liveness probe failed: HTTP probe failed with statuscode: 500", Source:v1.EventSource{Component:"kubelet", Host:""Failinghostname".dc.example.com"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63727981774, loc:(*time.Location)(0x7016480)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfb2994f9311a13e, ext:69940133514730, loc:(*time.Location)(0x7016480)}}, Count:2, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'etcdserver: request timed out' (will not retry!)

Jun 17 10:18:26 "Failinghostname" kubelet: E0617 10:18:26.279847   16962 controller.go:136] failed to ensure node lease exists, will retry in 400ms, error: Get https://LOADBALANCER IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/"Failinghostname".dc.example.com?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Jun 17 10:18:30 "Failinghostname" kubelet: W0617 10:18:30.078032   16962 status_manager.go:556] Failed to get status for pod "kube-controller-manager-"Failinghostname".dc.example.com_kube-system(16c0b106893c92aa29fe120498337172)": etcdserver: request timed out
Jun 17 10:18:36 "Failinghostname" kubelet: E0617 10:18:36.680315   16962 controller.go:136] failed to ensure node lease exists, will retry in 800ms, error: Get https://LOADBALANCER IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/"Failinghostname".dc.example.com?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Jun 17 10:18:37 "Failinghostname" kubelet: E0617 10:18:37.077075   16962 event.go:260] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"etcd-"Failinghostname".dc.example.com.161947942decd43a", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"etcd-"Failinghostname".dc.example.com", UID:"88651eea1301b8771da6de75524446cd", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{etcd}"}, Reason:"Unhealthy", Message:"Liveness probe failed: HTTP probe failed with statuscode: 503", Source:v1.EventSource{Component:"kubelet", Host:""Failinghostname".dc.example.com"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbfb2930d4034023a, ext:63531203087561, loc:(*time.Location)(0x7016480)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfb2994f97323de4, ext:69940202760835, loc:(*time.Location)(0x7016480)}}, Count:2, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'etcdserver: request timed out' (will not retry!)

Jun 17 10:19:14 "Failinghostname" containerd: time="2020-06-17T10:19:14.849721494Z" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/05df8e20a1ff13fe0d76f813bdf37769ae5469aea877bf68c56f5440e09b5c97/shim.sock" debug=false pid=26770
Jun 17 10:19:14 "Failinghostname" kubelet: E0617 10:19:14.886369   16962 event.go:269] Unable to write event: 'Post https://LOADBALANCER IP:6443/api/v1/namespaces/kube-system/events: write tcp controlpalne-1:58708->LOADBALANCER IP:6443: write: connection reset by peer' (may retry after sleeping)
Jun 17 10:19:15 "Failinghostname" kubelet: I0617 10:19:15.576071   16962 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 7e8ca8ee96aefb31139d69b0afa543bf3e2f407019c6edd997e6fcbdbe35cd2f
Jun 17 10:19:20 "Failinghostname" kubelet: E0617 10:19:20.626656   16962 kubelet_node_status.go:402] Error updating node status, will retry: error getting node ""Failinghostname".dc.example.com": Get https://LOADBALANCER IP:6443/api/v1/nodes/"Failinghostname".dc.example.com?timeout=10s: context deadline exceeded

Jun 17 10:23:22 "Failinghostname" kubelet: E0617 10:23:22.470185   16962 pod_workers.go:191] Error syncing pod 5077e0841ae3e87360fb23fd4d80ec7a ("kube-apiserver-"Failinghostname".dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-"Failinghostname".dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"

Jun 17 10:23:24 "Failinghostname" kubelet: E0617 10:23:24.606466   16962 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get https://LOADBALANCER IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/"Failinghostname".dc.example.com?timeout=10s: context deadline exceeded
Jun 17 10:23:26 "Failinghostname" kubelet: W0617 10:23:26.981241   16962 kubelet_pods.go:858] Unable to retrieve pull secret ingress-controller/ingress-reg-cred for ingress-controller/haproxy-ingress-7q4kt due to secret "ingress-reg-cred" not found.  The image pull may not succeed.

4.50:6443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dkube-proxy&resourceVersion=3089979: dial tcp LOADBALANCER IP:6443: i/o timeout
Jun 17 10:42:14 "Failinghostname" containerd: time="2020-06-17T10:42:14.671881153Z" level=info msg="shim reaped" id=299b4cd989b7e57b4ce6042f5dce266b3ecb4f2ab309a44e98cfdc04a13df095
Jun 17 10:42:14 "Failinghostname" dockerd: time="2020-06-17T10:42:14.681444552Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Jun 17 10:42:14 "Failinghostname" dockerd: time="2020-06-17T10:42:14.681639723Z" level=warning msg="299b4cd989b7e57b4ce6042f5dce266b3ecb4f2ab309a44e98cfdc04a13df095 cleanup: failed to unmount IPC: umount /var/lib/docker/containers/299b4cd989b7e57b4ce6042f5dce266b3ecb4f2ab309a44e98cfdc04a13df095/mounts/shm, flags: 0x2: no such file or directory"
neolit123 commented 4 years ago

/triage support

hi,

did you use this guide to set up the HA cluster? https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

Is it the expected behaviour when we take 2 Kubernetes master nodes down out of 3?

no, it is not.

dial tcp LOADBALANCER IP:6443: i/o timeout

this seems like an issue with your load balancer or general network setup.
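
A quick way to sanity-check that path from a node is sketched below (LOADBALANCER-IP is a placeholder; substitute the real address):

# does the load balancer accept TCP on 6443, and does the apiserver answer behind it?
nc -vz LOADBALANCER-IP 6443
curl -k https://LOADBALANCER-IP:6443/healthz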

once you stop the 2 extra control-plane nodes, you can go on the remaining CP node and see what the logs for the etcd and kube-apiserver containers say.
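
A sketch of how that might look (the logs above suggest a Docker-with-containerd runtime; container IDs are placeholders):

# on the remaining control-plane node
docker ps -a | grep -E 'etcd|kube-apiserver'
docker logs --tail 100 <etcd-container-id>
docker logs --tail 100 <kube-apiserver-container-id>
# or read the per-container log files directly:
tail -n 100 /var/log/containers/etcd-*.log /var/log/containers/kube-apiserver-*.log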

Sjnahak commented 4 years ago

Yes, I used https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ to set up the cluster.

Please find fresh logs below; I reproduced the issue by taking down 2 control-plane nodes.

Logs from the only control plane node still running:

etcd logs:

[root@Controlplane-1 containers]# tail -f etcd-Controlplane-1.dc.example.com_kube-system_etcd-d062ac2afda6f3e96c9d7c449c8201a08361d708725b74a01b2eea1623f362d5.log

{"log":"raft2020/06/21 05:43:24 INFO: e121085cb2a6a09a is starting a new election at term 9780\n","stream":"stderr","time":"2020-06-21T05:43:24.141343148Z"}
{"log":"raft2020/06/21 05:43:24 INFO: e121085cb2a6a09a became candidate at term 9781\n","stream":"stderr","time":"2020-06-21T05:43:24.141374961Z"}
{"log":"raft2020/06/21 05:43:24 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9781\n","stream":"stderr","time":"2020-06-21T05:43:24.141380838Z"}
{"log":"raft2020/06/21 05:43:24 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9781\n","stream":"stderr","time":"2020-06-21T05:43:24.141398022Z"}
{"log":"raft2020/06/21 05:43:24 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9781\n","stream":"stderr","time":"2020-06-21T05:43:24.141403057Z"}
{"log":"raft2020/06/21 05:43:25 INFO: e121085cb2a6a09a is starting a new election at term 9781\n","stream":"stderr","time":"2020-06-21T05:43:25.241408765Z"}
{"log":"raft2020/06/21 05:43:25 INFO: e121085cb2a6a09a became candidate at term 9782\n","stream":"stderr","time":"2020-06-21T05:43:25.24145734Z"}
{"log":"raft2020/06/21 05:43:25 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9782\n","stream":"stderr","time":"2020-06-21T05:43:25.241464504Z"}
{"log":"raft2020/06/21 05:43:25 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9782\n","stream":"stderr","time":"2020-06-21T05:43:25.241525452Z"}
{"log":"raft2020/06/21 05:43:25 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9782\n","stream":"stderr","time":"2020-06-21T05:43:25.241535684Z"}
{"log":"raft2020/06/21 05:43:27 INFO: e121085cb2a6a09a is starting a new election at term 9782\n","stream":"stderr","time":"2020-06-21T05:43:27.14138059Z"}
{"log":"raft2020/06/21 05:43:27 INFO: e121085cb2a6a09a became candidate at term 9783\n","stream":"stderr","time":"2020-06-21T05:43:27.141409838Z"}
{"log":"raft2020/06/21 05:43:27 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9783\n","stream":"stderr","time":"2020-06-21T05:43:27.141419706Z"}
{"log":"raft2020/06/21 05:43:27 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9783\n","stream":"stderr","time":"2020-06-21T05:43:27.141427051Z"}
{"log":"raft2020/06/21 05:43:27 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9783\n","stream":"stderr","time":"2020-06-21T05:43:27.141434201Z"}
{"log":"2020-06-21 05:43:27.658287 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:27.658458133Z"}
{"log":"2020-06-21 05:43:27.658325 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:27.658498514Z"}
{"log":"2020-06-21 05:43:27.663330 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:27.663426094Z"}
{"log":"2020-06-21 05:43:27.663342 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:27.663443642Z"}
{"log":"raft2020/06/21 05:43:28 INFO: e121085cb2a6a09a is starting a new election at term 9783\n","stream":"stderr","time":"2020-06-21T05:43:28.941494502Z"}
{"log":"raft2020/06/21 05:43:28 INFO: e121085cb2a6a09a became candidate at term 9784\n","stream":"stderr","time":"2020-06-21T05:43:28.941528859Z"}
{"log":"raft2020/06/21 05:43:28 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9784\n","stream":"stderr","time":"2020-06-21T05:43:28.941536826Z"}
{"log":"raft2020/06/21 05:43:28 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9784\n","stream":"stderr","time":"2020-06-21T05:43:28.941543437Z"}
{"log":"raft2020/06/21 05:43:28 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9784\n","stream":"stderr","time":"2020-06-21T05:43:28.941550219Z"}
{"log":"raft2020/06/21 05:43:30 INFO: e121085cb2a6a09a is starting a new election at term 9784\n","stream":"stderr","time":"2020-06-21T05:43:30.441297593Z"}
{"log":"raft2020/06/21 05:43:30 INFO: e121085cb2a6a09a became candidate at term 9785\n","stream":"stderr","time":"2020-06-21T05:43:30.4413353Z"}
{"log":"raft2020/06/21 05:43:30 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9785\n","stream":"stderr","time":"2020-06-21T05:43:30.441363252Z"}
{"log":"raft2020/06/21 05:43:30 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9785\n","stream":"stderr","time":"2020-06-21T05:43:30.441373151Z"}
{"log":"raft2020/06/21 05:43:30 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9785\n","stream":"stderr","time":"2020-06-21T05:43:30.44138036Z"}
{"log":"2020-06-21 05:43:30.663408 E | etcdserver: publish error: etcdserver: request timed out\n","stream":"stderr","time":"2020-06-21T05:43:30.663602199Z"}
{"log":"raft2020/06/21 05:43:31 INFO: e121085cb2a6a09a is starting a new election at term 9785\n","stream":"stderr","time":"2020-06-21T05:43:31.541458231Z"}
{"log":"raft2020/06/21 05:43:31 INFO: e121085cb2a6a09a became candidate at term 9786\n","stream":"stderr","time":"2020-06-21T05:43:31.541479659Z"}
{"log":"raft2020/06/21 05:43:31 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9786\n","stream":"stderr","time":"2020-06-21T05:43:31.541484539Z"}
{"log":"raft2020/06/21 05:43:31 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9786\n","stream":"stderr","time":"2020-06-21T05:43:31.541488645Z"}
{"log":"raft2020/06/21 05:43:31 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9786\n","stream":"stderr","time":"2020-06-21T05:43:31.541492857Z"}
{"log":"2020-06-21 05:43:32.658445 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:32.658633623Z"}
{"log":"2020-06-21 05:43:32.658490 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:32.658684689Z"}
{"log":"2020-06-21 05:43:32.663443 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:32.663533063Z"}
{"log":"2020-06-21 05:43:32.663464 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:32.663555891Z"}
{"log":"raft2020/06/21 05:43:33 INFO: e121085cb2a6a09a is starting a new election at term 9786\n","stream":"stderr","time":"2020-06-21T05:43:33.141371179Z"}
{"log":"raft2020/06/21 05:43:33 INFO: e121085cb2a6a09a became candidate at term 9787\n","stream":"stderr","time":"2020-06-21T05:43:33.14141579Z"}
{"log":"raft2020/06/21 05:43:33 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9787\n","stream":"stderr","time":"2020-06-21T05:43:33.141422183Z"}
{"log":"raft2020/06/21 05:43:33 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9787\n","stream":"stderr","time":"2020-06-21T05:43:33.141426533Z"}
{"log":"raft2020/06/21 05:43:33 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9787\n","stream":"stderr","time":"2020-06-21T05:43:33.1414308Z"}
{"log":"raft2020/06/21 05:43:35 INFO: e121085cb2a6a09a is starting a new election at term 9787\n","stream":"stderr","time":"2020-06-21T05:43:35.041455138Z"}
{"log":"raft2020/06/21 05:43:35 INFO: e121085cb2a6a09a became candidate at term 9788\n","stream":"stderr","time":"2020-06-21T05:43:35.041489405Z"}
{"log":"raft2020/06/21 05:43:35 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9788\n","stream":"stderr","time":"2020-06-21T05:43:35.041498094Z"}
{"log":"raft2020/06/21 05:43:35 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9788\n","stream":"stderr","time":"2020-06-21T05:43:35.041504898Z"}
{"log":"raft2020/06/21 05:43:35 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9788\n","stream":"stderr","time":"2020-06-21T05:43:35.041529741Z"}

{"log":"raft2020/06/21 05:43:36 INFO: e121085cb2a6a09a is starting a new election at term 9788\n","stream":"stderr","time":"2020-06-21T05:43:36.841390834Z"}
{"log":"raft2020/06/21 05:43:36 INFO: e121085cb2a6a09a became candidate at term 9789\n","stream":"stderr","time":"2020-06-21T05:43:36.841426128Z"}
{"log":"raft2020/06/21 05:43:36 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9789\n","stream":"stderr","time":"2020-06-21T05:43:36.841431745Z"}
{"log":"raft2020/06/21 05:43:36 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9789\n","stream":"stderr","time":"2020-06-21T05:43:36.84143597Z"}
{"log":"raft2020/06/21 05:43:36 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9789\n","stream":"stderr","time":"2020-06-21T05:43:36.841440142Z"}
{"log":"2020-06-21 05:43:37.658650 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:37.658809222Z"}
{"log":"2020-06-21 05:43:37.658710 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:37.658843049Z"}
{"log":"2020-06-21 05:43:37.664479 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:37.665574509Z"}
{"log":"2020-06-21 05:43:37.664522 E | etcdserver: publish error: etcdserver: request timed out\n","stream":"stderr","time":"2020-06-21T05:43:37.665602088Z"}
{"log":"2020-06-21 05:43:37.664553 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:37.665607627Z"}
{"log":"raft2020/06/21 05:43:37 INFO: e121085cb2a6a09a is starting a new election at term 9789\n","stream":"stderr","time":"2020-06-21T05:43:37.84135358Z"}
{"log":"raft2020/06/21 05:43:37 INFO: e121085cb2a6a09a became candidate at term 9790\n","stream":"stderr","time":"2020-06-21T05:43:37.841386108Z"}
{"log":"raft2020/06/21 05:43:37 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9790\n","stream":"stderr","time":"2020-06-21T05:43:37.841391804Z"}
{"log":"raft2020/06/21 05:43:37 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9790\n","stream":"stderr","time":"2020-06-21T05:43:37.841396252Z"}
{"log":"raft2020/06/21 05:43:37 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9790\n","stream":"stderr","time":"2020-06-21T05:43:37.841400356Z"}
{"log":"raft2020/06/21 05:43:39 INFO: e121085cb2a6a09a is starting a new election at term 9790\n","stream":"stderr","time":"2020-06-21T05:43:39.441340782Z"}
{"log":"raft2020/06/21 05:43:39 INFO: e121085cb2a6a09a became candidate at term 9791\n","stream":"stderr","time":"2020-06-21T05:43:39.441366044Z"}
{"log":"raft2020/06/21 05:43:39 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9791\n","stream":"stderr","time":"2020-06-21T05:43:39.44137171Z"}
{"log":"raft2020/06/21 05:43:39 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9791\n","stream":"stderr","time":"2020-06-21T05:43:39.441376224Z"}
{"log":"raft2020/06/21 05:43:39 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9791\n","stream":"stderr","time":"2020-06-21T05:43:39.441380469Z"}
{"log":"raft2020/06/21 05:43:41 INFO: e121085cb2a6a09a is starting a new election at term 9791\n","stream":"stderr","time":"2020-06-21T05:43:41.141415741Z"}
{"log":"raft2020/06/21 05:43:41 INFO: e121085cb2a6a09a became candidate at term 9792\n","stream":"stderr","time":"2020-06-21T05:43:41.141473253Z"}
{"log":"raft2020/06/21 05:43:41 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9792\n","stream":"stderr","time":"2020-06-21T05:43:41.141481161Z"}
{"log":"raft2020/06/21 05:43:41 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9792\n","stream":"stderr","time":"2020-06-21T05:43:41.141485381Z"}
{"log":"raft2020/06/21 05:43:41 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9792\n","stream":"stderr","time":"2020-06-21T05:43:41.141489426Z"}
{"log":"raft2020/06/21 05:43:42 INFO: e121085cb2a6a09a is starting a new election at term 9792\n","stream":"stderr","time":"2020-06-21T05:43:42.141441819Z"}
{"log":"raft2020/06/21 05:43:42 INFO: e121085cb2a6a09a became candidate at term 9793\n","stream":"stderr","time":"2020-06-21T05:43:42.141486181Z"}
{"log":"raft2020/06/21 05:43:42 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9793\n","stream":"stderr","time":"2020-06-21T05:43:42.141495332Z"}
{"log":"raft2020/06/21 05:43:42 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9793\n","stream":"stderr","time":"2020-06-21T05:43:42.141502797Z"}
{"log":"raft2020/06/21 05:43:42 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9793\n","stream":"stderr","time":"2020-06-21T05:43:42.141509957Z"}
{"log":"2020-06-21 05:43:42.658821 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:42.659003374Z"}
{"log":"2020-06-21 05:43:42.658860 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:42.659039568Z"}
{"log":"2020-06-21 05:43:42.664595 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:42.664664244Z"}
{"log":"2020-06-21 05:43:42.664618 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:42.66467923Z"}
{"log":"raft2020/06/21 05:43:44 INFO: e121085cb2a6a09a is starting a new election at term 9793\n","stream":"stderr","time":"2020-06-21T05:43:44.04152621Z"}
{"log":"raft2020/06/21 05:43:44 INFO: e121085cb2a6a09a became candidate at term 9794\n","stream":"stderr","time":"2020-06-21T05:43:44.041559581Z"}
{"log":"raft2020/06/21 05:43:44 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9794\n","stream":"stderr","time":"2020-06-21T05:43:44.04156546Z"}
{"log":"raft2020/06/21 05:43:44 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9794\n","stream":"stderr","time":"2020-06-21T05:43:44.041569669Z"}
{"log":"raft2020/06/21 05:43:44 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9794\n","stream":"stderr","time":"2020-06-21T05:43:44.041573915Z"}
{"log":"2020-06-21 05:43:44.664759 E | etcdserver: publish error: etcdserver: request timed out\n","stream":"stderr","time":"2020-06-21T05:43:44.664913213Z"}
{"log":"raft2020/06/21 05:43:45 INFO: e121085cb2a6a09a is starting a new election at term 9794\n","stream":"stderr","time":"2020-06-21T05:43:45.241587562Z"}
{"log":"raft2020/06/21 05:43:45 INFO: e121085cb2a6a09a became candidate at term 9795\n","stream":"stderr","time":"2020-06-21T05:43:45.241614069Z"}
{"log":"raft2020/06/21 05:43:45 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9795\n","stream":"stderr","time":"2020-06-21T05:43:45.241619246Z"}
{"log":"raft2020/06/21 05:43:45 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9795\n","stream":"stderr","time":"2020-06-21T05:43:45.241623426Z"}
{"log":"raft2020/06/21 05:43:45 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9795\n","stream":"stderr","time":"2020-06-21T05:43:45.241639668Z"}
{"log":"raft2020/06/21 05:43:46 INFO: e121085cb2a6a09a is starting a new election at term 9795\n","stream":"stderr","time":"2020-06-21T05:43:46.241388215Z"}
{"log":"raft2020/06/21 05:43:46 INFO: e121085cb2a6a09a became candidate at term 9796\n","stream":"stderr","time":"2020-06-21T05:43:46.241422216Z"}
{"log":"raft2020/06/21 05:43:46 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9796\n","stream":"stderr","time":"2020-06-21T05:43:46.241427405Z"}
{"log":"raft2020/06/21 05:43:46 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9796\n","stream":"stderr","time":"2020-06-21T05:43:46.241431516Z"}
{"log":"raft2020/06/21 05:43:46 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9796\n","stream":"stderr","time":"2020-06-21T05:43:46.241435615Z"}
{"log":"raft2020/06/21 05:43:47 INFO: e121085cb2a6a09a is starting a new election at term 9796\n","stream":"stderr","time":"2020-06-21T05:43:47.541407749Z"}
{"log":"raft2020/06/21 05:43:47 INFO: e121085cb2a6a09a became candidate at term 9797\n","stream":"stderr","time":"2020-06-21T05:43:47.541433859Z"}
{"log":"raft2020/06/21 05:43:47 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9797\n","stream":"stderr","time":"2020-06-21T05:43:47.541438969Z"}
{"log":"raft2020/06/21 05:43:47 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9797\n","stream":"stderr","time":"2020-06-21T05:43:47.541443065Z"}
{"log":"raft2020/06/21 05:43:47 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9797\n","stream":"stderr","time":"2020-06-21T05:43:47.541447107Z"}
{"log":"2020-06-21 05:43:47.658952 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:47.659097354Z"}
{"log":"2020-06-21 05:43:47.658984 W | rafthttp: health check for peer 1e1194c275755c28 could not connect: dial tcp Controlplane-2:2380: connect: no route to host\n","stream":"stderr","time":"2020-06-21T05:43:47.659135444Z"}
{"log":"2020-06-21 05:43:47.664752 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:47.664862298Z"}
{"log":"2020-06-21 05:43:47.664780 W | rafthttp: health check for peer 4e8d639ed37e166c could not connect: dial tcp Controlplane-3:2380: i/o timeout\n","stream":"stderr","time":"2020-06-21T05:43:47.664880574Z"}
{"log":"raft2020/06/21 05:43:49 INFO: e121085cb2a6a09a is starting a new election at term 9797\n","stream":"stderr","time":"2020-06-21T05:43:49.041497721Z"}
{"log":"raft2020/06/21 05:43:49 INFO: e121085cb2a6a09a became candidate at term 9798\n","stream":"stderr","time":"2020-06-21T05:43:49.041527537Z"}
{"log":"raft2020/06/21 05:43:49 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9798\n","stream":"stderr","time":"2020-06-21T05:43:49.041532795Z"}
{"log":"raft2020/06/21 05:43:49 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9798\n","stream":"stderr","time":"2020-06-21T05:43:49.041537036Z"}
{"log":"raft2020/06/21 05:43:49 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9798\n","stream":"stderr","time":"2020-06-21T05:43:49.041541162Z"}
{"log":"raft2020/06/21 05:43:50 INFO: e121085cb2a6a09a is starting a new election at term 9798\n","stream":"stderr","time":"2020-06-21T05:43:50.341354995Z"}
{"log":"raft2020/06/21 05:43:50 INFO: e121085cb2a6a09a became candidate at term 9799\n","stream":"stderr","time":"2020-06-21T05:43:50.341374296Z"}
{"log":"raft2020/06/21 05:43:50 INFO: e121085cb2a6a09a received MsgVoteResp from e121085cb2a6a09a at term 9799\n","stream":"stderr","time":"2020-06-21T05:43:50.341395583Z"}
{"log":"raft2020/06/21 05:43:50 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 1e1194c275755c28 at term 9799\n","stream":"stderr","time":"2020-06-21T05:43:50.341401232Z"}
{"log":"raft2020/06/21 05:43:50 INFO: e121085cb2a6a09a [logterm: 9637, index: 5290395] sent MsgVote request to 4e8d639ed37e166c at term 9799\n","stream":"stderr","time":"2020-06-21T05:43:50.341405321Z"}

kube-apiserver logs:

[root@Controlplane-1 containers]# tail -f kube-apiserver-Controlplane-1.dc.example.com_kube-system_kube-apiserver-6c321a045d4d50225b575075d21ab8876c1c4f8c015d2c38455bd756c5d7f3c5.log
{"log":"W0621 05:41:17.998617       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.999238569Z"}
{"log":"W0621 05:41:17.998681       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.99924304Z"}
{"log":"W0621 05:41:17.998751       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.999247465Z"}
{"log":"W0621 05:41:17.998804       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.99925197Z"}
{"log":"W0621 05:41:17.998855       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.999256879Z"}
{"log":"W0621 05:41:17.998955       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.99926135Z"}
{"log":"W0621 05:41:17.999010       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.999265741Z"}
{"log":"W0621 05:41:17.999063       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.999270179Z"}
{"log":"W0621 05:41:17.999116       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: context deadline exceeded\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:41:17.9992785Z"}
{"log":"W0621 05:41:26.880534       1 controller.go:193] RemoveEndpoints() timed out\n","stream":"stderr","time":"2020-06-21T05:41:26.88142931Z"}

[root@Controlplane-1 containers]# tail -f kube-apiserver-Controlplane-1.dc.example.com_kube-system_kube-apiserver-a8487b42d022fa3dfa3638318d6a1628387c26ac63386c821a80c8157e646eab.log
{"log":"I0621 05:42:07.340750       1 server.go:656] external host was not specified, using controlplane-1\n","stream":"stderr","time":"2020-06-21T05:42:07.341026172Z"}
{"log":"I0621 05:42:07.341081       1 server.go:153] Version: v1.18.0\n","stream":"stderr","time":"2020-06-21T05:42:07.341628442Z"}
{"log":"I0621 05:42:08.001284       1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.\n","stream":"stderr","time":"2020-06-21T05:42:08.002062757Z"}
{"log":"I0621 05:42:08.001310       1 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.\n","stream":"stderr","time":"2020-06-21T05:42:08.002101204Z"}
{"log":"I0621 05:42:08.002485       1 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.\n","stream":"stderr","time":"2020-06-21T05:42:08.005035755Z"}
{"log":"I0621 05:42:08.002503       1 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.\n","stream":"stderr","time":"2020-06-21T05:42:08.00505386Z"}
{"log":"I0621 05:42:08.005549       1 client.go:361] parsed scheme: \"endpoint\"\n","stream":"stderr","time":"2020-06-21T05:42:08.005641159Z"}
{"log":"I0621 05:42:08.005595       1 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}]\n","stream":"stderr","time":"2020-06-21T05:42:08.00565401Z"}
{"log":"I0621 05:42:08.999453       1 client.go:361] parsed scheme: \"endpoint\"\n","stream":"stderr","time":"2020-06-21T05:42:09.00113692Z"}
{"log":"I0621 05:42:08.999490       1 endpoint.go:68] ccResolverWrapper: sending new addresses to cc: [{https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}]\n","stream":"stderr","time":"2020-06-21T05:42:09.001169159Z"}
{"log":"W0621 05:42:20.898343       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: read tcp 127.0.0.1:40300-\u003e127.0.0.1:2379: read: connection timed out\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:42:20.898634422Z"}
{"log":"W0621 05:42:21.890119       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  \u003cnil\u003e 0 \u003cnil\u003e}. Err :connection error: desc = \"transport: authentication handshake failed: read tcp 127.0.0.1:40322-\u003e127.0.0.1:2379: read: connection timed out\". Reconnecting...\n","stream":"stderr","time":"2020-06-21T05:42:21.890333144Z"}
{"log":"panic: context deadline exceeded\n","stream":"stderr","time":"2020-06-21T05:42:28.008794275Z"}
{"log":"\n","stream":"stderr","time":"2020-06-21T05:42:28.008853428Z"}
{"log":"goroutine 1 [running]:\n","stream":"stderr","time":"2020-06-21T05:42:28.008878475Z"}
{"log":"k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/registry/customresourcedefinition.NewREST(0xc00055c9a0, 0x50df5c0, 0xc0002f3680, 0xc0002f39c8)\n","stream":"stderr","time":"2020-06-21T05:42:28.008886656Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/registry/customresourcedefinition/etcd.go:56 +0x3e7\n","stream":"stderr","time":"2020-06-21T05:42:28.009044126Z"}
{"log":"k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/apiserver.completedConfig.New(0xc00000dec0, 0xc000214d88, 0x519de20, 0x773a7d8, 0x10, 0x0, 0x0)\n","stream":"stderr","time":"2020-06-21T05:42:28.009059525Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiextensions-apiserver/pkg/apiserver/apiserver.go:145 +0x14ef\n","stream":"stderr","time":"2020-06-21T05:42:28.009066604Z"}
{"log":"k8s.io/kubernetes/cmd/kube-apiserver/app.createAPIExtensionsServer(0xc000214d80, 0x519de20, 0x773a7d8, 0x0, 0x50df180, 0xc0004114c0)\n","stream":"stderr","time":"2020-06-21T05:42:28.009073262Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-apiserver/app/apiextensions.go:102 +0x59\n","stream":"stderr","time":"2020-06-21T05:42:28.009079671Z"}
{"log":"k8s.io/kubernetes/cmd/kube-apiserver/app.CreateServerChain(0xc0008642c0, 0xc000096fc0, 0x4558d31, 0xc, 0xc000a33c48)\n","stream":"stderr","time":"2020-06-21T05:42:28.009086286Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:186 +0x2b8\n","stream":"stderr","time":"2020-06-21T05:42:28.009092695Z"}
{"log":"k8s.io/kubernetes/cmd/kube-apiserver/app.Run(0xc0008642c0, 0xc000096fc0, 0x0, 0x0)\n","stream":"stderr","time":"2020-06-21T05:42:28.009099756Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:155 +0x101\n","stream":"stderr","time":"2020-06-21T05:42:28.009106216Z"}
{"log":"k8s.io/kubernetes/cmd/kube-apiserver/app.NewAPIServerCommand.func1(0xc000822a00, 0xc0008f4340, 0x0, 0x1a, 0x0, 0x0)\n","stream":"stderr","time":"2020-06-21T05:42:28.009113141Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-apiserver/app/server.go:122 +0x104\n","stream":"stderr","time":"2020-06-21T05:42:28.009119718Z"}
{"log":"k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc000822a00, 0xc0000bc010, 0x1a, 0x1b, 0xc000822a00, 0xc0000bc010)\n","stream":"stderr","time":"2020-06-21T05:42:28.009126808Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826 +0x460\n","stream":"stderr","time":"2020-06-21T05:42:28.009135522Z"}
{"log":"k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000822a00, 0x161a78b2bf69640e, 0x771c600, 0xc00006a750)\n","stream":"stderr","time":"2020-06-21T05:42:28.009142792Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914 +0x2fb\n","stream":"stderr","time":"2020-06-21T05:42:28.009149636Z"}
{"log":"k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(...)\n","stream":"stderr","time":"2020-06-21T05:42:28.009158131Z"}
{"log":"\u0009/workspace/anago-v1.18.0-rc.1.21+8be33caaf953ac/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864\n","stream":"stderr","time":"2020-06-21T05:42:28.009166156Z"}
{"log":"main.main()\n","stream":"stderr","time":"2020-06-21T05:42:28.009173728Z"}
{"log":"\u0009_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-apiserver/apiserver.go:43 +0xcd\n","stream":"stderr","time":"2020-06-21T05:42:28.009180291Z"}

[root@Contolplane-1 containers]# journalctl -fu kubelet
-- Logs begin at Wed 2020-06-17 13:44:57 UTC. --
Jun 21 05:45:28 Contolplane-1.dc.example.com kubelet[921]: Trace[1761897959]: [30.000448733s] [30.000448733s] END
Jun 21 05:45:28 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:28.152325 921 reflector.go:178] object-"ingress-controller"/"ingress-reg-cred": Failed to list v1.Secret: Get https://LOADBALANCER-IP:6443/api/v1/namespaces/ingress-controller/secrets?fieldSelector=metadata.name%3Dingress-reg-cred&resourceVersion=3550993: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:45:30 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:30.060686 921 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get https://LOADBALANCER-IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/Contolplane-1.dc.example.com?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Jun 21 05:45:32 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:32.112239 921 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Jun 21 05:45:34 Contolplane-1.dc.example.com kubelet[921]: I0621 05:45:34.995828 921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 660944b97f03b9423ee3dd3d34109599d3c3b33261b7532832dac2b7fb8837e4
Jun 21 05:45:34 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:34.996432 921 pod_workers.go:191] Error syncing pod 5077e0841ae3e87360fb23fd4d80ec7a ("kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 1m20s restarting failed container=kube-apiserver pod=kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"
Jun 21 05:45:42 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:42.132371 921 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Jun 21 05:45:46 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:46.914288 921 kubelet_node_status.go:402] Error updating node status, will retry: error getting node "Contolplane-1.dc.example.com": Get https://LOADBALANCER-IP:6443/api/v1/nodes/Contolplane-1.dc.example.com?resourceVersion=0&timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Jun 21 05:45:46 Contolplane-1.dc.example.com kubelet[921]: I0621 05:45:46.995872 921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 660944b97f03b9423ee3dd3d34109599d3c3b33261b7532832dac2b7fb8837e4
Jun 21 05:45:47 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:47.061159 921 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get https://LOADBALANCER-IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/Contolplane-1.dc.example.com?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Jun 21 05:45:52 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:52.148670 921 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Jun 21 05:45:56 Contolplane-1.dc.example.com kubelet[921]: W0621 05:45:56.552891 921 status_manager.go:556] Failed to get status for pod "kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)": Get https://LOADBALANCER-IP:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-Contolplane-1.dc.example.com: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:45:56 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:56.914821 921 kubelet_node_status.go:402] Error updating node status, will retry: error getting node "Contolplane-1.dc.example.com": Get https://LOADBALANCER-IP:6443/api/v1/nodes/Contolplane-1.dc.example.com?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Jun 21 05:45:58 Contolplane-1.dc.example.com kubelet[921]: I0621 05:45:58.001419 921 trace.go:116] Trace[356205815]: "Reflector ListAndWatch" name:object-"kube-system"/"flannel-token-2lxkt" (started: 2020-06-21 05:45:28.000773366 +0000 UTC m=+316815.175190059) (total time: 30.00061s):
Jun 21 05:45:58 Contolplane-1.dc.example.com kubelet[921]: Trace[356205815]: [30.00061s] [30.00061s] END
Jun 21 05:45:58 Contolplane-1.dc.example.com kubelet[921]: E0621 05:45:58.001453 921 reflector.go:178] object-"kube-system"/"flannel-token-2lxkt": Failed to list v1.Secret: Get https://LOADBALANCER-IP:6443/api/v1/namespaces/kube-system/secrets?fieldSelector=metadata.name%3Dflannel-token-2lxkt&resourceVersion=3550993: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:46:00 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:00.752124 921 event.go:269] Unable to write event: 'Patch https://LOADBALANCER-IP:6443/api/v1/namespaces/kube-system/events/kube-apiserver-Contolplane-1.dc.example.com.161961d1629d3e8e: dial tcp LOADBALANCER-IP:6443: i/o timeout' (may retry after sleeping)
Jun 21 05:46:02 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:02.170456 921 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Jun 21 05:46:04 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:04.059745 921 trace.go:116] Trace[1079606920]: "Reflector ListAndWatch" name:object-"kube-system"/"kube-flannel-cfg" (started: 2020-06-21 05:45:34.059253873 +0000 UTC m=+316821.233670531) (total time: 30.000454883s):
Jun 21 05:46:04 Contolplane-1.dc.example.com kubelet[921]: Trace[1079606920]: [30.000454883s] [30.000454883s] END
Jun 21 05:46:04 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:04.061047 921 reflector.go:178] object-"kube-system"/"kube-flannel-cfg": Failed to list v1.ConfigMap: Get https://LOADBALANCER-IP:6443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dkube-flannel-cfg&resourceVersion=3987933: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:46:04 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:04.064178 921 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get https://LOADBALANCER-IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/Contolplane-1.dc.example.com?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Jun 21 05:46:04 Contolplane-1.dc.example.com kubelet[921]: W0621 05:46:04.995820 921 kubelet_pods.go:858] Unable to retrieve pull secret ingress-controller/ingress-reg-cred for ingress-controller/haproxy-ingress-7q4kt due to secret "ingress-reg-cred" not found. The image pull may not succeed.
Jun 21 05:46:05 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:05.437366 921 trace.go:116] Trace[959276694]: "Reflector ListAndWatch" name:k8s.io/kubernetes/pkg/kubelet/kubelet.go:526 (started: 2020-06-21 05:45:35.436707199 +0000 UTC m=+316822.611123973) (total time: 30.000631844s):
Jun 21 05:46:05 Contolplane-1.dc.example.com kubelet[921]: Trace[959276694]: [30.000631844s] [30.000631844s] END
Jun 21 05:46:05 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:05.437391 921 reflector.go:178] k8s.io/kubernetes/pkg/kubelet/kubelet.go:526: Failed to list v1.Node: Get https://LOADBALANCER-IP:6443/api/v1/nodes?fieldSelector=metadata.name%3DContolplane-1.dc.example.com&resourceVersion=3987866: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:46:06 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:06.915897 921 kubelet_node_status.go:402] Error updating node status, will retry: error getting node "Contolplane-1.dc.example.com": Get https://LOADBALANCER-IP:6443/api/v1/nodes/Contolplane-1.dc.example.com?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Jun 21 05:46:08 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:08.952422 921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 660944b97f03b9423ee3dd3d34109599d3c3b33261b7532832dac2b7fb8837e4
Jun 21 05:46:08 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:08.953324 921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 7d6f27a2a48e9fb030d1683d6519b8bd6aa8947210d7ca7806e12021eaa7f504
Jun 21 05:46:08 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:08.953794 921 pod_workers.go:191] Error syncing pod 5077e0841ae3e87360fb23fd4d80ec7a ("kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"
Jun 21 05:46:10 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:10.754065 921 event.go:269] Unable to write event: 'Patch https://LOADBALANCER-IP:6443/api/v1/namespaces/kube-system/events/kube-apiserver-Contolplane-1.dc.example.com.161961d1629d3e8e: read tcp contolplane-1:39482->LOADBALANCER-IP:6443: read: connection reset by peer' (may retry after sleeping)
Jun 21 05:46:10 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:10.818647 921 trace.go:116] Trace[761471271]: "Reflector ListAndWatch" name:k8s.io/kubernetes/pkg/kubelet/kubelet.go:517 (started: 2020-06-21 05:45:40.818119385 +0000 UTC m=+316827.992536089) (total time: 30.000493261s):
Jun 21 05:46:10 Contolplane-1.dc.example.com kubelet[921]: Trace[761471271]: [30.000493261s] [30.000493261s] END
Jun 21 05:46:10 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:10.818670 921 reflector.go:178] k8s.io/kubernetes/pkg/kubelet/kubelet.go:517: Failed to list *v1.Service: Get https://LOADBALANCER-IP:6443/api/v1/services?resourceVersion=3365567: dial tcp LOADBALANCER-IP:6443: i/o timeout
Jun 21 05:46:12 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:12.197203 921 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"
Jun 21 05:46:14 Contolplane-1.dc.example.com kubelet[921]: I0621 05:46:14.812441 921 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 7d6f27a2a48e9fb030d1683d6519b8bd6aa8947210d7ca7806e12021eaa7f504
Jun 21 05:46:14 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:14.812997 921 pod_workers.go:191] Error syncing pod 5077e0841ae3e87360fb23fd4d80ec7a ("kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"
Jun 21 05:46:16 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:16.916205 921 kubelet_node_status.go:402] Error updating node status, will retry: error getting node "Contolplane-1.dc.example.com": Get https://LOADBALANCER-IP:6443/api/v1/nodes/Contolplane-1.dc.example.com?timeout=10s: context deadline exceeded
Jun 21 05:46:21 Contolplane-1.dc.example.com kubelet[921]: E0621 05:46:21.064634 921 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get https://LOADBALANCER-IP:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/Contolplane-1.dc.example.com?timeout=10s: context deadline exceeded
^C

neolit123 commented 4 years ago

note, we also have this guide for setting up a load balancer: https://github.com/kubernetes/kubeadm/blob/master/docs/ha-considerations.md
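
For reference, a minimal TCP pass-through sketch in the spirit of that doc, for haproxy (the bind port and the control-plane addresses below are placeholders):

frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcp-check
    balance roundrobin
    server controlplane-1 CONTROLPLANE-1-IP:6443 check
    server controlplane-2 CONTROLPLANE-2-IP:6443 check
    server controlplane-3 CONTROLPLANE-3-IP:6443 check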

pod_workers.go:191] Error syncing pod 5077e0841ae3e87360fb23fd4d80ec7a ("kube-apiserver-Contolplane-1.dc.example.com_kube-system(5077e0841ae3e87360fb23fd4d80ec7a)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 2m40s restarting failed

this tells me that one of your kube-apiservers is failing to start.

Sjnahak commented 4 years ago

@neolit123 Yes, my issue is that the kube-apiserver stops running when I take down 2 control-plane nodes out of 3.

Here is the output you requested, taken while all control-plane nodes are running.

[root@Controlpaln-1]# kubectl get pods -A
NAMESPACE              NAME                                                    READY   STATUS      RESTARTS   AGE
ci-cd                  jenkins-builds-b955c4849-2jbnr                          1/1     Running     1          11d
default                gerrit-55c5b7549c-zt2jd                                 1/1     Running     1          9d
default                mysql-8ff8b8c4c-487mq                                   1/1     Running     1          10d
default                nginx-deploy-blue-76fbc56598-p2tm2                      1/1     Running     0          3d17h
default                squid-hdhps                                             1/1     Running     1          3d12h
default                squid-mhhc6                                             1/1     Running     1          3d12h
default                squid-s8s9v                                             1/1     Running     0          3d12h
ingress-controller     haproxy-ingress-7q4kt                                   1/1     Running     4          11d
ingress-controller     haproxy-ingress-dqhpj                                   1/1     Running     10         12d
ingress-controller     haproxy-ingress-lgxxm                                   1/1     Running     2          13d
ingress-controller     haproxy-ingress-r6n6w                                   1/1     Running     8          12d
ingress-controller     ingress-default-backend-69b79cd689-b72kz                1/1     Running     1          13d
kube-system            backup-1592614800-tcwms                                 0/1     Completed   0          2d4h
kube-system            backup-1592701200-4mfds                                 0/1     Completed   0          28h
kube-system            backup-1592787600-nstwl                                 0/1     Completed   0          4h22m
kube-system            backup-2-1592618400-jkg49                               0/1     Completed   0          2d3h
kube-system            backup-2-1592704800-rfcj6                               0/1     Completed   0          27h
kube-system            backup-2-1592791200-mf6q8                               0/1     Completed   0          3h22m
kube-system            coredns-66bff467f8-4kd2q                                1/1     Running     2          16d
kube-system            coredns-66bff467f8-hbsz9                                1/1     Running     7          13d
kube-system            etcd-Controlpaln-1.dc.example.com                       1/1     Running     139        17d
kube-system            etcd-Controlpalne-2.dc.example.com                      1/1     Running     12         17d
kube-system            etcd-Controlpalne-3.dc.example.com                      1/1     Running     30         17d
kube-system            kube-apiserver-Controlpaln-1.dc.example.com             1/1     Running     157        17d
kube-system            kube-apiserver-Controlpalne-2.dc.example.com            1/1     Running     14         17d
kube-system            kube-apiserver-Controlpalne-3.dc.example.com            1/1     Running     33         17d
kube-system            kube-controller-manager-Controlpaln-1.dc.example.com    1/1     Running     12         17d
kube-system            kube-controller-manager-Controlpalne-2.dc.example.com   1/1     Running     17         17d
kube-system            kube-controller-manager-Controlpalne-3.dc.example.com   1/1     Running     16         17d
kube-system            kube-flannel-ds-amd64-bhx6x                             1/1     Running     2          17d
kube-system            kube-flannel-ds-amd64-mpt4c                             1/1     Running     8          17d
kube-system            kube-flannel-ds-amd64-r99cp                             1/1     Running     13         17d
kube-system            kube-flannel-ds-amd64-sdwlk                             1/1     Running     10         17d
kube-system            kube-proxy-4gtv6                                        1/1     Running     10         17d
kube-system            kube-proxy-78wzt                                        1/1     Running     9          17d
kube-system            kube-proxy-drwc6                                        1/1     Running     6          17d
kube-system            kube-proxy-mhr2f                                        1/1     Running     2          17d
kube-system            kube-scheduler-Controlpaln-1.dc.example.com             1/1     Running     17         17d
kube-system            kube-scheduler-Controlpalne-2.dc.example.com            1/1     Running     16         17d
kube-system            kube-scheduler-Controlpalne-3.dc.example.com            1/1     Running     14         17d
kubernetes-dashboard   dashboard-metrics-scraper-6b4884c9d5-w6h6d              1/1     Running     0          5d18h
kubernetes-dashboard   kubernetes-dashboard-7b544877d5-252px                   1/1     Running     0          2d18h

CURL OUTPUT

[root@controlplan-1]# curl  https://localhost:6443/version -k
{
  "major": "1",
  "minor": "18",
  "gitVersion": "v1.18.0",
  "gitCommit": "XXXXXXXXXXXXXXXXXXX",
  "gitTreeState": "clean",
  "buildDate": "2020-03-25T14:50:46Z",
  "goVersion": "go1.13.8",
  "compiler": "gc",
  "platform": "linux/amd64"
}
fabriziopandini commented 4 years ago

WRT the API server not responding: most of the problems I have faced in the past are due to the load balancer not kicking failed instances out of the pool in a timely manner. This usually depends on the health/liveness probe configuration.
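As a quick sanity check (just a sketch; CP1-IP/CP2-IP/CP3-IP below are placeholders for your control-plane addresses), you can probe each apiserver's /healthz directly and compare with what the load balancer considers healthy:

for ip in CP1-IP CP2-IP CP3-IP; do
  # a healthy apiserver answers HTTP 200 with body "ok"; -k skips cert verification
  echo -n "$ip: "; curl -sk --max-time 3 "https://$ip:6443/healthz"; echo
done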

Also, if you take down 2 masters out of three you most probably lose quorum, and etcd goes into read-only mode.
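You can also observe the quorum loss directly with etcdctl on the surviving node (a sketch assuming the kubeadm default certificate paths for stacked etcd; adjust if yours differ):

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  endpoint health
# with 2 of 3 members down this should time out or report the endpoint as unhealthy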

neolit123 commented 4 years ago

what do you see in the container logs (e.g. docker logs...) for apiserver and etcd on the only remaining master?
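for example, something along these lines (a sketch; the name filters rely on the k8s_<container>_<pod>_... names the kubelet's docker runtime gives to containers):

docker ps -a --filter name=k8s_kube-apiserver --format '{{.ID}} {{.Status}}'
docker logs --tail 50 $(docker ps -aq --filter name=k8s_kube-apiserver | head -1)
docker logs --tail 50 $(docker ps -aq --filter name=k8s_etcd | head -1)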

Sjnahak commented 4 years ago

@neolit123 I have shared them earlier in my previous comments, starting with "Yes i used ........"

neolit123 commented 4 years ago

Is it the Expected Behaviour when we take 2 kubernetes master node down out of 3

no, it is not.

^ so i'd like to correct myself here. it is in fact expected behavior, sorry for misreading the overall problem.

if you take 2 control-plane nodes out of 3 down, the remaining etcd member loses quorum, the kube-apiserver on the surviving node can no longer reach a healthy etcd, and it starts crash-looping exactly as in your logs.

now, if you remove only a single CP node out of 3, the cluster should continue to operate.

see this page https://etcd.io/docs/v3.2.17/faq/ about failure tolerance based on the number of nodes. for 3 members you can tolerate 1 failure; for 5 you can tolerate 2.
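the numbers in that table are just majority quorum: quorum = floor(n/2) + 1, tolerated failures = n - quorum. a small shell sketch of the same arithmetic:

for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))   # majority of n members
  echo "members=$n quorum=$quorum tolerated_failures=$(( n - quorum ))"
done
# members=3 quorum=2 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2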

there isn't much kubeadm can do about this behavior.

does this explain the problem to you?

Sjnahak commented 4 years ago

@neolit123 thanks for the confirmation. I just wanted to be doubly sure. Closing the issue.