k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Unable to join master, etcd cluster join failure #9227

Closed (bmorris53 closed 8 months ago)

bmorris53 commented 8 months ago

Environmental Info:

K3s Version:

$ k3s -v
k3s version v1.28.5+k3s1 (5b2d1271)
go version go1.20.12

Node(s) CPU architecture, OS, and Version:

$ uname -a
Linux coupa-k3s-master-1.dev.chtrse.com 5.14.0-362.8.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 8 17:36:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/redhat-release
Rocky Linux release 9.3 (Blue Onyx)

Cluster Configuration: Attempting to set up a 3-master HA system with embedded etcd, running in OpenStack with a dual-stack configuration. Security rules have been updated to allow ALL IPv4 and IPv6 traffic between all nodes within the security group. SELinux is enabled but set to permissive.

Describe the bug: The first node, started with cluster-init, comes up without issue. The kube-apiserver reports healthy, as does the etcd instance that is running:

Initial Cluster Init:

$ curl -sfL https://get.k3s.io | sh -s server --cluster-init --token token --node-ip 172.20.0.161,<v6 IP redacted> --cluster-cidr 10.42.0.0/16,2001:cafe:42:0::/56 --service-cidr 10.43.0.0/16,2001:cafe:42:1::/112 --write-kubeconfig-mode 664 --flannel-ipv6-masq

$ kubectl get nodes -o wide
NAME           STATUS   ROLES                       AGE   VERSION        INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                      KERNEL-VERSION                CONTAINER-RUNTIME
k3s-master-1   Ready    control-plane,etcd,master   22h   v1.28.5+k3s1   172.20.0.161   <none>        Rocky Linux 9.3 (Blue Onyx)   5.14.0-362.8.1.el9_3.x86_64   containerd://1.7.11-k3s2

$ sudo /usr/local/bin/etcdctl --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt --key=/var/lib/rancher/k3s/server/tls/etcd/client.key --endpoints=127.0.0.1:2379,172.20.0.161:2379 endpoint health
172.20.0.161:2379 is healthy: successfully committed proposal: took = 6.276927ms
127.0.0.1:2379 is healthy: successfully committed proposal: took = 6.061308ms

Steps To Reproduce: On second node:

$ curl -sfL https://get.k3s.io | sh -s - server --token token --server https://172.20.0.161:6443 --node-ip 172.20.0.142,<v6 IP redacted> --cluster-cidr 10.42.0.0/16,2001:cafe:42:0::/56 --service-cidr 10.43.0.0/16,2001:cafe:42:1::/112 --write-kubeconfig-mode 664 --flannel-ipv6-masq

Journal log produces the following:

Jan 12 18:36:30 k3s-master-2 systemd[1]: Starting Lightweight Kubernetes...
Jan 12 18:36:30 k3s-master-2 sh[256815]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Jan 12 18:36:30 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:30Z" level=info msg="Starting k3s v1.28.5+k3s1 (5b2d1271)"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Managed etcd cluster not yet initialized"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Reconciling bootstrap data between datastore and disk"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg=start
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="schedule, now=2024-01-12T18:36:31Z, entry=1, next=2024-01-13T00:00:00Z"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Running kube-apiserver --advertise-address=172.20.0.142 --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=https://kubernetes.default.svc.cluster.local,k3s --authorization-mode=Node,RBAC --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --egress-selector-config-file=/var/lib/rancher/k3s/server/etc/egress-selector-config.yaml --enable-admission-plugins=NodeRestriction --enable-aggregator-routing=true --enable-bootstrap-token-auth=true --etcd-cafile=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --etcd-certfile=/var/lib/rancher/k3s/server/tls/etcd/client.crt --etcd-keyfile=/var/lib/rancher/k3s/server/tls/etcd/client.key --etcd-servers=https://127.0.0.1:2379 --feature-gates=JobTrackingWithFinalizers=true --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --profiling=false --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.current.key 
--service-cluster-ip-range=10.43.0.0/16,2001:cafe:42:1::/112 --service-node-port-range=30000-32767 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Running kube-scheduler --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --profiling=false --secure-port=10259"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16,2001:cafe:42::/56 --cluster-signing-kube-apiserver-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.nochain.crt --cluster-signing-kube-apiserver-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.nochain.crt --cluster-signing-kubelet-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-serving-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.nochain.crt --cluster-signing-kubelet-serving-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --cluster-signing-legacy-unknown-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.nochain.crt --cluster-signing-legacy-unknown-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --configure-cloud-routes=false --controllers=*,tokencleaner,-service,-route,-cloud-node-lifecycle --feature-gates=JobTrackingWithFinalizers=true --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=10257 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.current.key --service-cluster-ip-range=10.43.0.0/16,2001:cafe:42:1::/112 --use-service-account-credentials=true"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Running cloud-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --bind-address=127.0.0.1 --cloud-config=/var/lib/rancher/k3s/server/etc/cloud-config.yaml --cloud-provider=k3s --cluster-cidr=10.42.0.0/16,2001:cafe:42::/56 --configure-cloud-routes=false --controllers=*,-route --feature-gates=CloudDualStackNodeIPs=true --kubeconfig=/var/lib/rancher/k3s/server/cred/cloud-controller.kubeconfig --leader-elect-resource-name=k3s-cloud-controller-manager --node-status-update-frequency=1m0s --profiling=false"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Server node token is available at /var/lib/rancher/k3s/server/token"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="To join server node to cluster: k3s server -s https://172.20.0.142:6443 -t ${SERVER_NODE_TOKEN}"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Agent node token is available at /var/lib/rancher/k3s/server/agent-token"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="To join agent node to cluster: k3s agent -s https://172.20.0.142:6443 -t ${AGENT_NODE_TOKEN}"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
Jan 12 18:36:31 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:31Z" level=info msg="Run: k3s kubectl"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Password verified locally for node k3s-master-2"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="certificate CN=k3s-master-2 signed by CN=k3s-server-ca@1705003371: notBefore=2024-01-11 20:02:51 +0000 UTC notAfter=2025-01-11 18:36:32 +0000 UTC"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="certificate CN=system:node:k3s-master-2,O=system:nodes signed by CN=k3s-client-ca@1705003371: notBefore=2024-01-11 20:02:51 +0000 UTC notAfter=2025-01-11 18:36:32 +0000 UTC"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module overlay was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module nf_conntrack was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module br_netfilter was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module iptable_nat was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module iptable_filter was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module ip6table_nat was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Module ip6table_filter was already loaded"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=warning msg="SELinux is enabled on this host, but k3s has not been started with --selinux - containerd SELinux support is disabled"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Logging containerd to /var/lib/rancher/k3s/agent/containerd/containerd.log"
Jan 12 18:36:32 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:32Z" level=info msg="Running containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd"
Jan 12 18:36:33 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:33Z" level=info msg="containerd is now running"
Jan 12 18:36:33 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:33Z" level=info msg="Connecting to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Jan 12 18:36:33 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:33Z" level=info msg="Running kubelet --address=0.0.0.0 --allowed-unsafe-sysctls=net.ipv4.ip_forward,net.ipv6.conf.all.forwarding --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt --cloud-provider=external --cluster-dns=10.43.0.10,2001:cafe:42:1::a --cluster-domain=cluster.local --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --feature-gates=CloudDualStackNodeIPs=true --healthz-bind-address=127.0.0.1 --hostname-override=k3s-master-2 --kubeconfig=/var/lib/rancher/k3s/agent/kubelet.kubeconfig --node-ip=172.20.0.142,<redacted> --node-labels= --pod-infra-container-image=rancher/mirrored-pause:3.6 --pod-manifest-path=/var/lib/rancher/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --serialize-image-pulls=false --tls-cert-file=/var/lib/rancher/k3s/agent/serving-kubelet.crt --tls-private-key-file=/var/lib/rancher/k3s/agent/serving-kubelet.key"
Jan 12 18:36:33 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:33Z" level=info msg="Handling backend connection request [k3s-master-2]"
Jan 12 18:36:33 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:33Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 12 18:36:35 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:35Z" level=info msg="Adding member k3s-master-2-70e3b748=https://172.20.0.142:2380 to etcd cluster [coupa-k3s-master-1.dev.chtrse.com-c9bb05ae=https://172.20.0.161:2380]"
Jan 12 18:36:38 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:38Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 12 18:36:43 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:43Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 12 18:36:46 k3s-master-2 k3s[256819]: {"level":"warn","ts":"2024-01-12T18:36:46.212776Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000678380/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:2379: connect: connection refused\""}
Jan 12 18:36:46 k3s-master-2 k3s[256819]: {"level":"info","ts":"2024-01-12T18:36:46.213471Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/client.go:210","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
Jan 12 18:36:48 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:48Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 12 18:36:50 k3s-master-2 k3s[256819]: {"level":"warn","ts":"2024-01-12T18:36:50.529306Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0012be000/172.20.0.161:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: failed to do connect handshake, response: \\\"HTTP/1.1 403 Forbidden\\\\r\\\\nContent-Length: 3478\\\\r\\\\nConnection: keep-alive\\\\r\\\\nContent-Language: en\\\\r\\\\nContent-Type: text/html;charset=utf-8\\\\r\\\\nDate: Fri, 12 Jan 2024 18:36:45 GMT\\\\r\\\\nMime-Version: 1.0\\\\r\\\\nServer: squid/3.5.27\\\\r\\\\nVary: Accept-Language\\\\r\\\\nVia: 1.1 7491d61a006d (squid/3.5.27)\\\\r\\\\nX-Cache: MISS from 7491d61a006d\\\\r\\\\nX-Cache-Lookup: NONE from 7491d61a006d:3128\\\\r\\\\nX-Squid-Error: ERR_ACCESS_DENIED 0\\\\r\\\\n\\\\r\\\\n<!DOCTYPE html PUBLIC \\\\\\\"-//W3C//DTD HTML 4.01//EN\\\\\\\" \\\\\\\"http://www.w3.org/TR/html4/strict.dtd\\\\\\\">\\\\n<html><head>\\\\n<meta type=\\\\\\\"copyright\\\\\\\" content=\\\\\\\"Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\\\\">\\\\n<meta http-equiv=\\\\\\\"Content-Type\\\\\\\" content=\\\\\\\"text/html; charset=utf-8\\\\\\\">\\\\n<title>ERROR: The requested URL could not be retrieved</title>\\\\n<style type=\\\\\\\"text/css\\\\\\\"><!--\\\\n /*\\\\n * Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\n *\\\\n * Squid software is distributed under GPLv2+ license and includes\\\\n * contributions from numerous individuals and organizations.\\\\n * Please see the COPYING and CONTRIBUTORS files for details.\\\\n */\\\\n\\\\n/*\\\\n Stylesheet for Squid Error pages\\\\n Adapted from design by Free CSS Templates\\\\n http://www.freecsstemplates.org\\\\n Released for free under a Creative Commons Attribution 2.5 License\\\\n*/\\\\n\\\\n/* Page 
basics */\\\\n* {\\\\n\\\\tfont-family: verdana, sans-serif;\\\\n}\\\\n\\\\nhtml body {\\\\n\\\\tmargin: 0;\\\\n\\\\tpadding: 0;\\\\n\\\\tbackground: #efefef;\\\\n\\\\tfont-size: 12px;\\\\n\\\\tcolor: #1e1e1e;\\\\n}\\\\n\\\\n/* Page displayed title area */\\\\n#titles {\\\\n\\\\tmargin-left: 15px;\\\\n\\\\tpadding: 10px;\\\\n\\\\tpadding-left: 100px;\\\\n\\\\tbackground: url('/squid-internal-static/icons/SN.png') no-repeat left;\\\\n}\\\\n\\\\n/* initial title */\\\\n#titles h1 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n#titles h2 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n\\\\n/* special event: FTP success page titles */\\\\n#titles ftpsuccess {\\\\n\\\\tbackground-color:#00ff00;\\\\n\\\\twidth:100%;\\\\n}\\\\n\\\\n/* Page displayed body content area */\\\\n#content {\\\\n\\\\tpadding: 10px;\\\\n\\\\tbackground: #ffffff;\\\\n}\\\\n\\\\n/* General text */\\\\np {\\\\n}\\\\n\\\\n/* error brief description */\\\\n#error p {\\\\n}\\\\n\\\\n/* some data which may have caused the problem */\\\\n#data {\\\\n}\\\\n\\\\n/* the error message received from the system or other software */\\\\n#sysmsg {\\\\n}\\\\n\\\\npre {\\\\n    font-family:sans-serif;\\\\n}\\\\n\\\\n/* special event: FTP / Gopher directory listing */\\\\n#dirmsg {\\\\n    font-family: courier;\\\\n    color: black;\\\\n    font-size: 10pt;\\\\n}\\\\n#dirlisting {\\\\n    margin-left: 2%;\\\\n    margin-right: 2%;\\\\n}\\\\n#dirlisting tr.entry td.icon,td.filename,td.size,td.date {\\\\n    border-bottom: groove;\\\\n}\\\\n#dirlisting td.size {\\\\n    width: 50px;\\\\n    text-align: right;\\\\n    padding-right: 5px;\\\\n}\\\\n\\\\n/* horizontal lines */\\\\nhr {\\\\n\\\\tmargin: 0;\\\\n}\\\\n\\\\n/* page displayed footer area */\\\\n#footer {\\\\n\\\\tfont-size: 9px;\\\\n\\\\tpadding-left: 10px;\\\\n}\\\\n\\\\n\\\\nbody\\\\n:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }\\\\n:lang(he) { direction: rtl; }\\\\n --></style>\\\\n</head><body 
id=ERR_ACCESS_DENIED>\\\\n<div id=\\\\\\\"titles\\\\\\\">\\\\n<h1>ERROR</h1>\\\\n<h2>The requested URL could not be retrieved</h2>\\\\n</div>\\\\n<hr>\\\\n\\\\n<div id=\\\\\\\"content\\\\\\\">\\\\n<p>The following error was encountered while trying to retrieve the URL: <a href=\\\\\\\"172.20.0.161:2379\\\\\\\">172.20.0.161:2379</a></p>\\\\n\\\\n<blockquote id=\\\\\\\"error\\\\\\\">\\\\n<p><b>Access Denied.</b></p>\\\\n</blockquote>\\\\n\\\\n<p>Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.</p>\\\\n\\\\n<p>Your cache administrator is <a href=\\\\\\\"mailto:webmaster?subject=CacheErrorInfo%20-%20ERR_ACCESS_DENIED&amp;body=CacheHost%3A%207491d61a006d%0D%0AErrPage%3A%20ERR_ACCESS_DENIED%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Fri,%2012%20Jan%202024%2018%3A36%3A45%20GMT%0D%0A%0D%0AClientIP%3A%2047.43.111.84%0D%0A%0D%0AHTTP%20Request%3A%0D%0ACONNECT%20%2F%20HTTP%2F1.1%0AUser-Agent%3A%20grpc-go%2F1.58.3%0D%0AHost%3A%20172.20.0.161%3A2379%0D%0A%0D%0A%0D%0A\\\\\\\">webmaster</a>.</p>\\\\n<br>\\\\n</div>\\\\n\\\\n<hr>\\\\n<div id=\\\\\\\"footer\\\\\\\">\\\\n<p>Generated Fri, 12 Jan 2024 18:36:45 GMT by 7491d61a006d (squid/3.5.27)</p>\\\\n<!-- ERR_ACCESS_DENIED -->\\\\n</div>\\\\n</body></html>\\\\n\\\"\""}
Jan 12 18:36:50 k3s-master-2 k3s[256819]: {"level":"info","ts":"2024-01-12T18:36:50.53063Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/client.go:210","msg":"Auto sync endpoints failed.","error":"context deadline exceeded"}
Jan 12 18:36:53 k3s-master-2 k3s[256819]: {"level":"warn","ts":"2024-01-12T18:36:53.523353Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0012be000/172.20.0.161:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: failed to do connect handshake, response: \\\"HTTP/1.1 403 Forbidden\\\\r\\\\nContent-Length: 3478\\\\r\\\\nConnection: keep-alive\\\\r\\\\nContent-Language: en\\\\r\\\\nContent-Type: text/html;charset=utf-8\\\\r\\\\nDate: Fri, 12 Jan 2024 18:36:52 GMT\\\\r\\\\nMime-Version: 1.0\\\\r\\\\nServer: squid/3.5.27\\\\r\\\\nVary: Accept-Language\\\\r\\\\nVia: 1.1 7491d61a006d (squid/3.5.27)\\\\r\\\\nX-Cache: MISS from 7491d61a006d\\\\r\\\\nX-Cache-Lookup: NONE from 7491d61a006d:3128\\\\r\\\\nX-Squid-Error: ERR_ACCESS_DENIED 0\\\\r\\\\n\\\\r\\\\n<!DOCTYPE html PUBLIC \\\\\\\"-//W3C//DTD HTML 4.01//EN\\\\\\\" \\\\\\\"http://www.w3.org/TR/html4/strict.dtd\\\\\\\">\\\\n<html><head>\\\\n<meta type=\\\\\\\"copyright\\\\\\\" content=\\\\\\\"Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\\\\">\\\\n<meta http-equiv=\\\\\\\"Content-Type\\\\\\\" content=\\\\\\\"text/html; charset=utf-8\\\\\\\">\\\\n<title>ERROR: The requested URL could not be retrieved</title>\\\\n<style type=\\\\\\\"text/css\\\\\\\"><!--\\\\n /*\\\\n * Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\n *\\\\n * Squid software is distributed under GPLv2+ license and includes\\\\n * contributions from numerous individuals and organizations.\\\\n * Please see the COPYING and CONTRIBUTORS files for details.\\\\n */\\\\n\\\\n/*\\\\n Stylesheet for Squid Error pages\\\\n Adapted from design by Free CSS Templates\\\\n http://www.freecsstemplates.org\\\\n Released for free under a Creative Commons Attribution 2.5 License\\\\n*/\\\\n\\\\n/* Page 
basics */\\\\n* {\\\\n\\\\tfont-family: verdana, sans-serif;\\\\n}\\\\n\\\\nhtml body {\\\\n\\\\tmargin: 0;\\\\n\\\\tpadding: 0;\\\\n\\\\tbackground: #efefef;\\\\n\\\\tfont-size: 12px;\\\\n\\\\tcolor: #1e1e1e;\\\\n}\\\\n\\\\n/* Page displayed title area */\\\\n#titles {\\\\n\\\\tmargin-left: 15px;\\\\n\\\\tpadding: 10px;\\\\n\\\\tpadding-left: 100px;\\\\n\\\\tbackground: url('/squid-internal-static/icons/SN.png') no-repeat left;\\\\n}\\\\n\\\\n/* initial title */\\\\n#titles h1 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n#titles h2 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n\\\\n/* special event: FTP success page titles */\\\\n#titles ftpsuccess {\\\\n\\\\tbackground-color:#00ff00;\\\\n\\\\twidth:100%;\\\\n}\\\\n\\\\n/* Page displayed body content area */\\\\n#content {\\\\n\\\\tpadding: 10px;\\\\n\\\\tbackground: #ffffff;\\\\n}\\\\n\\\\n/* General text */\\\\np {\\\\n}\\\\n\\\\n/* error brief description */\\\\n#error p {\\\\n}\\\\n\\\\n/* some data which may have caused the problem */\\\\n#data {\\\\n}\\\\n\\\\n/* the error message received from the system or other software */\\\\n#sysmsg {\\\\n}\\\\n\\\\npre {\\\\n    font-family:sans-serif;\\\\n}\\\\n\\\\n/* special event: FTP / Gopher directory listing */\\\\n#dirmsg {\\\\n    font-family: courier;\\\\n    color: black;\\\\n    font-size: 10pt;\\\\n}\\\\n#dirlisting {\\\\n    margin-left: 2%;\\\\n    margin-right: 2%;\\\\n}\\\\n#dirlisting tr.entry td.icon,td.filename,td.size,td.date {\\\\n    border-bottom: groove;\\\\n}\\\\n#dirlisting td.size {\\\\n    width: 50px;\\\\n    text-align: right;\\\\n    padding-right: 5px;\\\\n}\\\\n\\\\n/* horizontal lines */\\\\nhr {\\\\n\\\\tmargin: 0;\\\\n}\\\\n\\\\n/* page displayed footer area */\\\\n#footer {\\\\n\\\\tfont-size: 9px;\\\\n\\\\tpadding-left: 10px;\\\\n}\\\\n\\\\n\\\\nbody\\\\n:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }\\\\n:lang(he) { direction: rtl; }\\\\n --></style>\\\\n</head><body 
id=ERR_ACCESS_DENIED>\\\\n<div id=\\\\\\\"titles\\\\\\\">\\\\n<h1>ERROR</h1>\\\\n<h2>The requested URL could not be retrieved</h2>\\\\n</div>\\\\n<hr>\\\\n\\\\n<div id=\\\\\\\"content\\\\\\\">\\\\n<p>The following error was encountered while trying to retrieve the URL: <a href=\\\\\\\"172.20.0.161:2379\\\\\\\">172.20.0.161:2379</a></p>\\\\n\\\\n<blockquote id=\\\\\\\"error\\\\\\\">\\\\n<p><b>Access Denied.</b></p>\\\\n</blockquote>\\\\n\\\\n<p>Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.</p>\\\\n\\\\n<p>Your cache administrator is <a href=\\\\\\\"mailto:webmaster?subject=CacheErrorInfo%20-%20ERR_ACCESS_DENIED&amp;body=CacheHost%3A%207491d61a006d%0D%0AErrPage%3A%20ERR_ACCESS_DENIED%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Fri,%2012%20Jan%202024%2018%3A36%3A52%20GMT%0D%0A%0D%0AClientIP%3A%2047.43.111.84%0D%0A%0D%0AHTTP%20Request%3A%0D%0ACONNECT%20%2F%20HTTP%2F1.1%0AUser-Agent%3A%20grpc-go%2F1.58.3%0D%0AHost%3A%20172.20.0.161%3A2379%0D%0A%0D%0A%0D%0A\\\\\\\">webmaster</a>.</p>\\\\n<br>\\\\n</div>\\\\n\\\\n<hr>\\\\n<div id=\\\\\\\"footer\\\\\\\">\\\\n<p>Generated Fri, 12 Jan 2024 18:36:52 GMT by 7491d61a006d (squid/3.5.27)</p>\\\\n<!-- ERR_ACCESS_DENIED -->\\\\n</div>\\\\n</body></html>\\\\n\\\"\""}
Jan 12 18:36:53 k3s-master-2 k3s[256819]: time="2024-01-12T18:36:53Z" level=fatal msg="etcd cluster join failed: context deadline exceeded"
Jan 12 18:36:53 k3s-master-2 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Jan 12 18:36:53 k3s-master-2 systemd[1]: k3s.service: Failed with result 'exit-code'.
Jan 12 18:36:53 k3s-master-2 systemd[1]: Failed to start Lightweight Kubernetes.
Jan 12 18:36:53 k3s-master-2 systemd[1]: k3s.service: Consumed 24.604s CPU time.
Jan 12 18:36:58 k3s-master-2 systemd[1]: k3s.service: Scheduled restart job, restart counter is at 3.
Jan 12 18:36:58 k3s-master-2 systemd[1]: Stopped Lightweight Kubernetes.
Jan 12 18:36:58 k3s-master-2 systemd[1]: k3s.service: Consumed 24.605s CPU time.

Expected behavior: Second node should join cluster

Actual behavior: Second node fails to join cluster

bmorris53 commented 8 months ago

Also, using nc I can confirm that the master-1 node is reachable on all the ports I believe should be reachable.

from master-2:


[bmorris@k3s-master-2 ~]$ nc -v 172.20.0.161 6443
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 172.20.0.161:6443.
^C
[bmorris@k3s-master-2 ~]$ nc -v 172.20.0.161 2379
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 172.20.0.161:2379.
^C
[bmorris@k3s-master-2 ~]$ nc -v 172.20.0.161 2380
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 172.20.0.161:2380.

brandond commented 8 months ago

Maybe you missed the 403 Forbidden errors from squid in the log file:

Jan 12 18:36:50 k3s-master-2 k3s[256819]: {"level":"warn","ts":"2024-01-12T18:36:50.529306Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0012be000/172.20.0.161:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: failed to do connect handshake, response: \\\"HTTP/1.1 403 Forbidden\\\\r\\\\nContent-Length: 3478\\\\r\\\\nConnection: keep-alive\\\\r\\\\nContent-Language: en\\\\r\\\\nContent-Type: text/html;charset=utf-8\\\\r\\\\nDate: Fri, 12 Jan 2024 18:36:45 GMT\\\\r\\\\nMime-Version: 1.0\\\\r\\\\nServer: squid/3.5.27\\\\r\\\\nVary: Accept-Language\\\\r\\\\nVia: 1.1 7491d61a006d (squid/3.5.27)\\\\r\\\\nX-Cache: MISS from 7491d61a006d\\\\r\\\\nX-Cache-Lookup: NONE from 7491d61a006d:3128\\\\r\\\\nX-Squid-Error: ERR_ACCESS_DENIED 0\\\\r\\\\n\\\\r\\\\n<!DOCTYPE html PUBLIC \\\\\\\"-//W3C//DTD HTML 4.01//EN\\\\\\\" \\\\\\\"http://www.w3.org/TR/html4/strict.dtd\\\\\\\">\\\\n<html><head>\\\\n<meta type=\\\\\\\"copyright\\\\\\\" content=\\\\\\\"Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\\\\">\\\\n<meta http-equiv=\\\\\\\"Content-Type\\\\\\\" content=\\\\\\\"text/html; charset=utf-8\\\\\\\">\\\\n<title>ERROR: The requested URL could not be retrieved</title>\\\\n<style type=\\\\\\\"text/css\\\\\\\"><!--\\\\n /*\\\\n * Copyright (C) 1996-2017 The Squid Software Foundation and contributors\\\\n *\\\\n * Squid software is distributed under GPLv2+ license and includes\\\\n * contributions from numerous individuals and organizations.\\\\n * Please see the COPYING and CONTRIBUTORS files for details.\\\\n */\\\\n\\\\n/*\\\\n Stylesheet for Squid Error pages\\\\n Adapted from design by Free CSS Templates\\\\n http://www.freecsstemplates.org\\\\n Released for free under a Creative Commons Attribution 2.5 License\\\\n*/\\\\n\\\\n/* Page 
basics */\\\\n* {\\\\n\\\\tfont-family: verdana, sans-serif;\\\\n}\\\\n\\\\nhtml body {\\\\n\\\\tmargin: 0;\\\\n\\\\tpadding: 0;\\\\n\\\\tbackground: #efefef;\\\\n\\\\tfont-size: 12px;\\\\n\\\\tcolor: #1e1e1e;\\\\n}\\\\n\\\\n/* Page displayed title area */\\\\n#titles {\\\\n\\\\tmargin-left: 15px;\\\\n\\\\tpadding: 10px;\\\\n\\\\tpadding-left: 100px;\\\\n\\\\tbackground: url('/squid-internal-static/icons/SN.png') no-repeat left;\\\\n}\\\\n\\\\n/* initial title */\\\\n#titles h1 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n#titles h2 {\\\\n\\\\tcolor: #000000;\\\\n}\\\\n\\\\n/* special event: FTP success page titles */\\\\n#titles ftpsuccess {\\\\n\\\\tbackground-color:#00ff00;\\\\n\\\\twidth:100%;\\\\n}\\\\n\\\\n/* Page displayed body content area */\\\\n#content {\\\\n\\\\tpadding: 10px;\\\\n\\\\tbackground: #ffffff;\\\\n}\\\\n\\\\n/* General text */\\\\np {\\\\n}\\\\n\\\\n/* error brief description */\\\\n#error p {\\\\n}\\\\n\\\\n/* some data which may have caused the problem */\\\\n#data {\\\\n}\\\\n\\\\n/* the error message received from the system or other software */\\\\n#sysmsg {\\\\n}\\\\n\\\\npre {\\\\n font-family:sans-serif;\\\\n}\\\\n\\\\n/* special event: FTP / Gopher directory listing */\\\\n#dirmsg {\\\\n font-family: courier;\\\\n color: black;\\\\n font-size: 10pt;\\\\n}\\\\n#dirlisting {\\\\n margin-left: 2%;\\\\n margin-right: 2%;\\\\n}\\\\n#dirlisting tr.entry td.icon,td.filename,td.size,td.date {\\\\n border-bottom: groove;\\\\n}\\\\n#dirlisting td.size {\\\\n width: 50px;\\\\n text-align: right;\\\\n padding-right: 5px;\\\\n}\\\\n\\\\n/* horizontal lines */\\\\nhr {\\\\n\\\\tmargin: 0;\\\\n}\\\\n\\\\n/* page displayed footer area */\\\\n#footer {\\\\n\\\\tfont-size: 9px;\\\\n\\\\tpadding-left: 10px;\\\\n}\\\\n\\\\n\\\\nbody\\\\n:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }\\\\n:lang(he) { direction: rtl; }\\\\n --></style>\\\\n</head><body id=ERR_ACCESS_DENIED>\\\\n<div 
id=\\\\\\\"titles\\\\\\\">\\\\n<h1>ERROR</h1>\\\\n<h2>The requested URL could not be retrieved</h2>\\\\n</div>\\\\n<hr>\\\\n\\\\n<div id=\\\\\\\"content\\\\\\\">\\\\n<p>The following error was encountered while trying to retrieve the URL: <a href=\\\\\\\"172.20.0.161:2379\\\\\\\">172.20.0.161:2379</a></p>\\\\n\\\\n<blockquote id=\\\\\\\"error\\\\\\\">\\\\n<p><b>Access Denied.</b></p>\\\\n</blockquote>\\\\n\\\\n<p>Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.</p>\\\\n\\\\n<p>Your cache administrator is <a href=\\\\\\\"mailto:webmaster?subject=CacheErrorInfo%20-%20ERR_ACCESS_DENIED&amp;body=CacheHost%3A%207491d61a006d%0D%0AErrPage%3A%20ERR_ACCESS_DENIED%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Fri,%2012%20Jan%202024%2018%3A36%3A45%20GMT%0D%0A%0D%0AClientIP%3A%2047.43.111.84%0D%0A%0D%0AHTTP%20Request%3A%0D%0ACONNECT%20%2F%20HTTP%2F1.1%0AUser-Agent%3A%20grpc-go%2F1.58.3%0D%0AHost%3A%20172.20.0.161%3A2379%0D%0A%0D%0A%0D%0A\\\\\\\">webmaster</a>.</p>\\\\n<br>\\\\n</div>\\\\n\\\\n<hr>\\\\n<div id=\\\\\\\"footer\\\\\\\">\\\\n<p>Generated Fri, 12 Jan 2024 18:36:45 GMT by 7491d61a006d (squid/3.5.27)</p>\\\\n<!-- ERR_ACCESS_DENIED -->\\\\n</div>\\\\n</body></html>\\\\n\\\"\""}
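A response like the one quoted above is easy to lose inside a long journal. As a rough sketch, a small filter can pull out just the proxy fingerprints (the HTTP status line and the `X-Squid-Error` header). The function name `find_proxy_errors` is hypothetical, and the patterns assume a Squid proxy answering over HTTP/1.1 as in this log; other proxies would need different patterns.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print the distinct
# proxy fingerprints found in it (HTTP status line, Squid error header).
find_proxy_errors() {
    grep -oE 'HTTP/1\.1 [0-9]{3} [A-Za-z]+|X-Squid-Error: [A-Z_]+' | sort -u
}

# Small demonstration on a shortened sample of the log line quoted above.
printf '%s\n' 'response: \"HTTP/1.1 403 Forbidden\\r\\nServer: squid/3.5.27\\r\\nX-Squid-Error: ERR_ACCESS_DENIED 0\\r' \
    | find_proxy_errors

# Possible use against the real journal (assumes the systemd unit is named k3s):
#   journalctl -u k3s --no-pager | find_proxy_errors
```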

Your proxy is breaking etcd. If you have HTTP_PROXY or HTTPS_PROXY variables present in your environment, make sure that your internal node IPs or IP ranges are included in the NO_PROXY list. If you don't want K3s to use the proxy, then remove the proxy vars from the k3s env file.
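A minimal sketch of that check, under some assumptions: the env file path `/etc/systemd/system/k3s.service.env` is where a systemd-based install typically keeps these variables but may differ on your host, and the helper names `proxy_vars` and `append_no_proxy` are hypothetical. The node IPs and cluster/service CIDRs are the ones from this report.

```shell
#!/bin/sh
# Hypothetical helper: print proxy-related variables found in an env file.
proxy_vars() {
    grep -iE '^(https?_proxy|no_proxy)=' "$1" 2>/dev/null || true
}

# Hypothetical helper: append entries to a NO_PROXY value,
# skipping any entry that is already present in the list.
append_no_proxy() {
    current="$1"
    shift
    for entry in "$@"; do
        case ",$current," in
            *",$entry,"*) ;;                              # already listed
            *) current="${current:+$current,}$entry" ;;   # add with comma separator
        esac
    done
    printf '%s\n' "$current"
}

# Assumed location of the K3s service env file; adjust if yours differs.
proxy_vars /etc/systemd/system/k3s.service.env

# Build a NO_PROXY value covering the node IPs and the IPv4 cluster/service CIDRs.
append_no_proxy "localhost,127.0.0.1" \
    172.20.0.161 172.20.0.142 10.42.0.0/16 10.43.0.0/16
```

The printed value would then go into the `NO_PROXY=` line of the env file, followed by a restart of the k3s service, so that the etcd client connects to the other servers directly instead of through Squid.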