Closed krasimirdermendzhiev closed 3 years ago
I'm trying to fix the problem. When I configure this ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  corefile: |
    rewrite name etcd-0.etcd kubernetes.default.svc.cluster.local.
I no longer get these log lines
2021-08-07 08:52:42.486079 I | embed: advertise client URLs = https://etcd-0.etcd:2379
{"level":"warn","ts":1628152774.606514,"caller":"netutil/netutil.go:121","msg":"failed to resolve URL Host","url":"https://etcd-0.etcd:2380","host":"etcd-0.etcd:2380","retry-interval":1,"error":"lookup etcd-0.etcd on xxx.xxx.xxx.10:53: no such host"}
in my etcd pods, but I still have a problem. When I try to create the tenant master, I get this message: cannot find sts/etcd in ns default-8e1cb1-vc-sample-1: default-8e1cb1-vc-sample-1/etcd is not ready in 120 seconds
These are the logs from the pod:
kubectl logs pod/etcd-0 -n default-8e1cb1-vc-sample-1
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2021-08-09 06:03:18.695492 I | etcdmain: etcd Version: 3.4.0
2021-08-09 06:03:18.695537 I | etcdmain: Git SHA: 898bd1351
2021-08-09 06:03:18.695541 I | etcdmain: Go Version: go1.12.9
2021-08-09 06:03:18.695545 I | etcdmain: Go OS/Arch: linux/amd64
2021-08-09 06:03:18.695549 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2021-08-09 06:03:18.695611 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/tls.crt, key = /etc/kubernetes/pki/etcd/tls.key, trusted-ca = /etc/kubernetes/pki/root/tls.crt, client-cert-auth = true, crl-file =
2021-08-09 06:03:18.696125 I | embed: name = etcd-0
2021-08-09 06:03:18.696136 I | embed: data dir = /var/lib/etcd/data
2021-08-09 06:03:18.696141 I | embed: member dir = /var/lib/etcd/data/member
2021-08-09 06:03:18.696144 I | embed: heartbeat = 100ms
2021-08-09 06:03:18.696148 I | embed: election = 1000ms
2021-08-09 06:03:18.696152 I | embed: snapshot count = 100000
2021-08-09 06:03:18.696160 I | embed: advertise client URLs = https://etcd-0.etcd:2379
{"level":"info","ts":1628488998.7060077,"caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"https://etcd-0.etcd:2380","host":"etcd-0.etcd:2380","resolved-addr":"xxx.xxx.xxx.1:2380"}
{"level":"info","ts":1628488998.7071345,"caller":"netutil/netutil.go:112","msg":"resolved URL Host","url":"https://etcd-0.etcd:2380","host":"etcd-0.etcd:2380","resolved-addr":"xxx.xxx.xxx.1:2380"}
2021-08-09 06:03:18.711994 I | etcdserver: starting member 1252090b999e74b4 in cluster e47539242bb46ea
raft2021/08/09 06:03:18 INFO: 1252090b999e74b4 switched to configuration voters=()
raft2021/08/09 06:03:18 INFO: 1252090b999e74b4 became follower at term 0
raft2021/08/09 06:03:18 INFO: newRaft 1252090b999e74b4 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
raft2021/08/09 06:03:18 INFO: 1252090b999e74b4 became follower at term 1
raft2021/08/09 06:03:18 INFO: 1252090b999e74b4 switched to configuration voters=(1320127586199565492)
2021-08-09 06:03:18.716165 W | auth: simple token is not cryptographically signed
2021-08-09 06:03:18.719014 I | etcdserver: starting server... [version: 3.4.0, cluster version: to_be_decided]
2021-08-09 06:03:18.719883 I | etcdserver: 1252090b999e74b4 as single-node; fast-forwarding 9 ticks (election ticks 10)
raft2021/08/09 06:03:18 INFO: 1252090b999e74b4 switched to configuration voters=(1320127586199565492)
2021-08-09 06:03:18.720543 I | etcdserver/membership: added member 1252090b999e74b4 [https://etcd-0.etcd:2380] to cluster e47539242bb46ea
2021-08-09 06:03:18.721387 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/tls.crt, key = /etc/kubernetes/pki/etcd/tls.key, trusted-ca = /etc/kubernetes/pki/root/tls.crt, client-cert-auth = true, crl-file =
2021-08-09 06:03:18.721504 I | embed: listening for peers on [::]:2380
raft2021/08/09 06:03:19 INFO: 1252090b999e74b4 is starting a new election at term 1
raft2021/08/09 06:03:19 INFO: 1252090b999e74b4 became candidate at term 2
raft2021/08/09 06:03:19 INFO: 1252090b999e74b4 received MsgVoteResp from 1252090b999e74b4 at term 2
raft2021/08/09 06:03:19 INFO: 1252090b999e74b4 became leader at term 2
raft2021/08/09 06:03:19 INFO: raft.node: 1252090b999e74b4 elected leader 1252090b999e74b4 at term 2
2021-08-09 06:03:19.313001 I | etcdserver: setting up the initial cluster version to 3.4
2021-08-09 06:03:19.313587 N | etcdserver/membership: set the initial cluster version to 3.4
2021-08-09 06:03:19.313635 I | etcdserver/api: enabled capabilities for version 3.4
2021-08-09 06:03:19.313650 I | embed: ready to serve client requests
2021-08-09 06:03:19.313676 I | etcdserver: published {Name:etcd-0 ClientURLs:[https://etcd-0.etcd:2379]} to cluster e47539242bb46ea
2021-08-09 06:03:19.314749 I | embed: serving client requests on [::]:2379
I stopped here
Warning Unhealthy 1s kubelet Readiness probe failed: {"level":"warn","ts":"2021-08-09T08:17:15.307Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7acea29d-a801-4282-8894-ef334a299146/etcd:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
https://etcd:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
Can you list everything in the default-0994e6-vc-sample-1 namespace? Make sure you have configured a headless Service for the etcd Pod.
This is my local setup:
kubectl get all -n tenant1admin-f7ea3a-vc-sample-1
NAME                       READY   STATUS    RESTARTS   AGE
pod/apiserver-0            1/1     Running   0          86d
pod/controller-manager-0   1/1     Running   0          86d
pod/etcd-0                 1/1     Running   0          86d

NAME                    TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
service/apiserver-svc   NodePort    10.98.56.6   <none>        6443:30015/TCP   86d
service/etcd            ClusterIP   None         <none>        <none>           86d

NAME                                  READY   AGE
statefulset.apps/apiserver            1/1     86d
statefulset.apps/controller-manager   1/1     86d
statefulset.apps/etcd                 1/1     86d
I am not sure whether you can hardcode the URL to "localhost:2379" as a temporary workaround. @charleszheng44 may have more ideas.
kubectl get all -n default-080686-vc-sample-1
NAME         READY   STATUS    RESTARTS   AGE
pod/etcd-0   0/1     Running   1          3m1s

NAME           TYPE        CLUSTER-IP
service/etcd   ClusterIP   None

NAME                    READY   AGE
statefulset.apps/etcd   0/1     14h
This makes me think it is a CoreDNS problem. Can you kubectl exec into the etcd Pod and check whether the DNS service in the super cluster is actually working?
kubectl exec -it pod/etcd-0 -n default-5e4fe0-vc-sample-1 -- sh
/ # cat /etc/resolv.conf
nameserver xxx.xxx.xxx.10
search default-5e4fe0-vc-sample-1.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
kubectl get all -n kube-system | grep coredns
pod/coredns-688ff95595-c8fmk 1/1 Running 0 62m
pod/coredns-688ff95595-ln4nv 1/1 Running 0 27m
deployment.apps/coredns 2/2 2 2 14h
replicaset.apps/coredns-688ff95595 2 2 2 6h48m
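The resolv.conf shown above determines which names the resolver actually sends to CoreDNS for a short host such as etcd-0.etcd. The following is a minimal sketch of the standard search-list rule (a name with fewer dots than ndots is tried with each search domain first, then as-is). The function names are hypothetical and the model is simplified; it is not the code etcd or the Go resolver runs.

```python
def parse_resolv_conf(text):
    """Return (search_domains, ndots) from resolv.conf-style text."""
    search, ndots = [], 1  # ndots defaults to 1 when no option is given
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "search":
            search = parts[1:]
        elif parts[0] == "options":
            for opt in parts[1:]:
                if opt.startswith("ndots:"):
                    ndots = int(opt.split(":", 1)[1])
    return search, ndots


def candidate_names(name, search, ndots):
    """List the fully qualified names a resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already absolute: the search list is skipped
    expanded = [f"{name}.{domain}." for domain in search]
    as_is = [f"{name}."]
    # Fewer dots than ndots: search list first, then the name as-is.
    if name.count(".") < ndots:
        return expanded + as_is
    return as_is + expanded


# Contents copied from the Pod's /etc/resolv.conf shown in this thread.
RESOLV_CONF = """\
nameserver xxx.xxx.xxx.10
search default-5e4fe0-vc-sample-1.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

search, ndots = parse_resolv_conf(RESOLV_CONF)
for fqdn in candidate_names("etcd-0.etcd", search, ndots):
    print(fqdn)
```

With this configuration, etcd-0.etcd has one dot, so the namespace-qualified name is tried first and the bare etcd-0.etcd. is tried last.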
I am not sure whether you can hardcode the URL to "localhost:2379" as a temporary workaround. @charleszheng44 may have more ideas.
I tried with 127.0.0.1 for
--advertise-client-urls=https://127.0.0.1:2379
--endpoints=https://127.0.0.1:2379
and etcd was created!
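One plausible reason the 127.0.0.1 workaround helps is that an IP literal needs no name lookup at all, so the failing DNS step is skipped. A small sketch of that distinction follows; the helper name needs_lookup is hypothetical and is not part of etcd or the virtualcluster code.

```python
import ipaddress
from urllib.parse import urlsplit


def needs_lookup(url):
    """Return True if the URL's host is a name that must be resolved."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True  # not an IP literal, so a name lookup is required
    return False


for url in ("https://etcd-0.etcd:2379", "https://127.0.0.1:2379"):
    print(url, "-> needs lookup:", needs_lookup(url))
```

The trade-off is that a loopback address is only reachable from inside the Pod itself, so this is a workaround rather than a fix for the underlying resolution failure.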
Hi @krasimirdermendzhiev, I ran into the same problem before, but I can't remember the root cause. My guess is that it is caused by the super master version (1.19); downgrading to 1.18 should make things work. If my memory is correct, the problem happens because the etcd service (etcd-0.etcd) is not accessible before the etcd Pod is ready, while the etcd Pod itself needs to reach the etcd service.
Yes, I think the same as you: "the problem happened because the Etcd service (etcd-0.etcd) is not accessible before the Etcd pod is ready, while the Etcd pod itself needs to visit the Etcd service."
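The suspected chicken-and-egg situation (the name only resolves once the Pod is ready, but the Pod needs the name to become ready) can be illustrated with a retry loop similar in spirit to the "retry-interval":1 seen in the etcd warning. This is a sketch under assumptions: wait_for_host and make_flaky_resolver are hypothetical names, the fake resolver and the 10.0.0.1 address are made up for the demonstration, and none of this is etcd's actual implementation.

```python
import socket
import time


def wait_for_host(host, port, attempts=5, interval=1.0, resolver=socket.getaddrinfo):
    """Retry name resolution until it succeeds or attempts run out.

    Returns the sorted list of resolved addresses, or [] on failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, port)
            # Each entry is (family, type, proto, canonname, sockaddr).
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as err:
            print(f"attempt {attempt}: failed to resolve {host}: {err}")
            if attempt < attempts:
                time.sleep(interval)
    return []


def make_flaky_resolver(failures):
    """Fake resolver: fails `failures` times, then returns one address."""
    state = {"left": failures}

    def resolver(host, port):
        if state["left"] > 0:
            state["left"] -= 1
            raise socket.gaierror(socket.EAI_NONAME, "no such host")
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.1", port))]

    return resolver


# Simulate a record that only appears after two failed lookups.
print(wait_for_host("etcd-0.etcd", 2380, interval=0.01,
                    resolver=make_flaky_resolver(2)))
```

If the record never appears because the Pod never becomes ready, the loop simply exhausts its attempts, which matches the "is not ready in 120 seconds" symptom reported above.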
Thank you, folks! P.S. I will close this issue; if somebody needs it, they can reopen it.
Hi, @krasimirdermendzhiev @Fei-Guo @charleszheng44
I ran into the same issue with Rancher Kubernetes v1.21, coredns, and calico. I tried hardcoding the URLs to localhost and I get the same error in the container logs:
"netutil/netutil.go:121","msg":"failed to resolve URL Host"
PS: I can't exec into the container as it doesn't start.
Thank you
Hello, folks
I have a problem with etcd when I try to create a tenant master. I'm following this guide: https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/doc/demo.md
I am trying this on a Kubernetes cluster, version 1.19.12.
kubectl get ns
kubectl get all -n vc-manager
kubectl get clusterversion
kubectl get VirtualCluster
I decided to deploy dnsutils to find where the problem is, and I can see that the pod works normally in the same tenant master namespace.
kubectl exec -i -t dnsutils -n default-0994e6-vc-sample-1 -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -n default-0994e6-vc-sample-1 -- cat /etc/resolv.conf
kubectl logs pod/etcd-0 -n default-0994e6-vc-sample-1
CoreDNS log
[INFO] "A IN etcd-0.etcd. udp 29 false 512" NXDOMAIN qr,rd,ra 29 0.000281613s
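When scanning many CoreDNS log lines like the one above, it can help to pull out just the queried name and the response code. The sketch below does that with a regular expression; the field layout is inferred from the single line quoted in this thread, and the names QUERY_RE and parse_query_log are hypothetical.

```python
import re

# Matches the quoted query part and the response code that follows it.
QUERY_RE = re.compile(
    r'"(?P<qtype>\S+) (?P<qclass>\S+) (?P<qname>\S+) (?P<proto>\S+) [^"]*" '
    r'(?P<rcode>[A-Z]+)'
)


def parse_query_log(line):
    """Extract query type, name, and response code from a CoreDNS log line."""
    match = QUERY_RE.search(line)
    if match is None:
        return None
    return {key: match.group(key) for key in ("qtype", "qname", "rcode")}


LINE = '[INFO] "A IN etcd-0.etcd. udp 29 false 512" NXDOMAIN qr,rd,ra 29 0.000281613s'
print(parse_query_log(LINE))
# → {'qtype': 'A', 'qname': 'etcd-0.etcd.', 'rcode': 'NXDOMAIN'}
```

Here the queried name is the bare etcd-0.etcd. with no namespace or cluster suffix, and the answer is NXDOMAIN.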