k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Getting container logs has TLS issues, reconfiguration breaks etcd, situation unclear #6679

Closed: toabi closed this issue 1 year ago

toabi commented 1 year ago

Environmental Info

K3s Version: v1.23.15+k3s1 (50cab3b3) on control-plane-0 (and v1.23.14+k3s1 on control-plane-1 and -2)

Update: I could also reproduce the container-logs issue on v1.22 control-plane clusters.

Node(s) CPU architecture, OS, and Version: Linux core-control-plane-0 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 3 servers, 11 agents. Each server has a private and a public IPv4 plus IPv6. They are Hetzner Cloud servers, and k3s is installed via cloud-init with `curl -sfL https://get.k3s.io` and the "right" parameters.
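For context, the install on each node boils down to roughly this (a sketch, not my exact cloud-init; the server URL and `K3S_TOKEN` are placeholders):

```sh
# Sketch of the cloud-init install step; token and server URL are placeholders.
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.23.15+k3s1" sh -s - agent \
  --server "https://10.0.1.1:6443" \
  --token "${K3S_TOKEN}" \
  --node-ip "10.0.1.4"   # agents get --node-ip; the servers (so far) did not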

The Bug/Story

I'm not sure whether this is a single bug, or where exactly it is hiding… but here's the story so far:

I have a weird situation which involves TLS certificates, node IPs, and etcd.

This setup had been working pretty well for some time, but now I noticed some strange behaviour:

The three control-plane servers have private IPs (10.0.1.1, 10.0.1.2, 10.0.1.3) in addition to their public IPv4 and IPv6 addresses.

Getting container logs from control-plane servers leads to a TLS error

Yesterday, while upgrading from v1.22 to v1.23.15, I noticed some strange behavior when using kubectl to get container logs from pods running on control-plane servers. The issue doesn't exist on the agents (I updated one agent to 1.23 to test it):

Failed to load logs: Get "https://10.0.1.1:10250/containerLogs/kube-system/hcloud-csi-node-7vcrq/csi-node-driver-registrar?tailLines=502&timestamps=true": x509: certificate is valid for 127.0.0.1, <public-ipv4>, <public-ipv6>, not 10.0.1.1
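(The apiserver dials the node's InternalIP here because of `--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname`, which shows up in the kube-apiserver args in the logs further down. Assuming a working kubeconfig, a quick way to see which addresses each node registered:)

```sh
# Show the addresses each node registered with the API server
kubectl get nodes -o wide
# Or just the address list for one node
kubectl get node core-control-plane-0 -o jsonpath='{.status.addresses}'
```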

So I dug in and found that I only pass --node-ip to the agents on startup, not to the control-plane servers. Maybe that's why they don't include that IP in their certificates?

Running `openssl s_client -connect 10.0.1.1:10250 < /dev/null | openssl x509 -text` shows me that the SANs are indeed only `DNS:core-control-plane-0, DNS:localhost, IP Address:127.0.0.1, IP Address:<public-ipv4>, IP Address:2A01:4F8:1C1C:E8D2:0:0:0:1`.

So I thought: aha, let's add --node-ip to the startup of the control-plane servers too. Then the next strange thing happened:

Adding node-ip to an existing control-plane server makes it unable to join etcd again

So I added `--node-ip 10.0.1.1` to the command-line args (via `systemctl edit k3s.service --full`, then restarted the service).
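For reference, the edit ends up looking roughly like this (a sketch, assuming the stock install layout with the unit at /etc/systemd/system/k3s.service):

```sh
# Relevant excerpt of ExecStart after `systemctl edit k3s.service --full` (sketch):
#
#   ExecStart=/usr/local/bin/k3s \
#       server \
#       --node-ip 10.0.1.1
#
# then reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart k3s
```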

On control-plane-1 I can now see this:

Jan 03 07:32:16 core-control-plane-1 k3s[1679]: {"level":"warn","ts":"2023-01-03T07:32:16.011Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"<public-ipv4>:55600","server-name":"","ip-addresses":["10.0.1.1","127.0.0.1","::1","10.0.1.1","10.2.0.1"],"dns-names":["localhost","core-control-plane-0"],"error":"tls: \"<$PUBLIC_IPV4\" does not match any of DNSNames [\"localhost\" \"core-control-plane-0\"] (lookup core-control-plane-0: Try again)"}

On control-plane-0 (where I added the node-ip) I see these logs:

Jan 03 07:33:52 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:33:52.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 is starting a new election at term 46"}
Jan 03 07:33:52 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:33:52.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 became pre-candidate at term 46"}
Jan 03 07:33:52 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:33:52.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 received MsgPreVoteResp from 6f1cea21cf164812 at term 46"}
Jan 03 07:33:52 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:33:52.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 [logterm: 46, index: 337443916] sent MsgPreVote request to 378283a013db6ca0 at term 46"}
Jan 03 07:33:52 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:33:52.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 [logterm: 46, index: 337443916] sent MsgPreVote request to 495f61aec428708b at term 46"}
Jan 03 07:33:53 core-control-plane-0 k3s[7264]: time="2023-01-03T07:33:53Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 07:33:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:33:56.883Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:33:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:33:56.883Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:33:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:33:56.884Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:33:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:33:56.884Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:33:56 core-control-plane-0 k3s[7264]: time="2023-01-03T07:33:56Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 07:33:58 core-control-plane-0 k3s[7264]: time="2023-01-03T07:33:58Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 07:34:01 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:34:01.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 is starting a new election at term 46"}

(These lines repeat multiple times.)

Full startup logs until repeating error:

```
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="Starting k3s v1.23.15+k3s1 (50cab3b3)"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="Managed etcd cluster bootstrap already complete and initialized"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="Reconciling bootstrap data between datastore and disk"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="Successfully reconciled with datastore"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="certificate CN=kube-apiserver signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 07:31:36 +0000 UTC"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="certificate CN=etcd-server signed by CN=etcd-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 07:31:36 +0000 UTC"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="certificate CN=etcd-peer signed by CN=etcd-peer-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 07:31:36 +0000 UTC"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:36Z" level=info msg="Starting etcd for existing cluster member"
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.660Z","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["https://10.0.1.1:2380","https://127.0.0.1:2380"]}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.660Z","caller":"embed/etcd.go:479","msg":"starting with peer TLS","tls-info":"cert = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt, key = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.key, client-cert=, client-k>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.660Z","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["https://10.0.1.1:2379","https://127.0.0.1:2379"]}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.660Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.17.13","go-os":"linux","go-arch":"amd64","max-cpu-set":2,"max-cpu-av>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.676Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/rancher/k3s/server/db/etcd/member/snap/db","took":"15.081007ms"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.747Z","caller":"etcdserver/server.go:508","msg":"recovered v2 store from snapshot","snapshot-index":337442520,"snapshot-size":"58 kB"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.747Z","caller":"etcdserver/server.go:521","msg":"recovered v3 backend from snapshot","backend-size-bytes":54046720,"backend-size":"54 MB","backend-size-in-use-bytes":52322304,"backend-size-in-use":"52 MB"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.792Z","caller":"etcdserver/raft.go:483","msg":"restarting local member","cluster-id":"d7e48328fbade5a9","local-member-id":"6f1cea21cf164812","commit-index":337443915}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.792Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 switched to configuration voters=(3999904142609575072 5287051890799440011 8006531668487063570)"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 became follower at term 46"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft 6f1cea21cf164812 [peers: [378283a013db6ca0,495f61aec428708b,6f1cea21cf164812], term: 46, commit: 337443915, applied: 337442520, lastindex: 337443916, lastterm: 4>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","caller":"membership/cluster.go:278","msg":"recovered/added member from store","cluster-id":"d7e48328fbade5a9","local-member-id":"6f1cea21cf164812","recovered-remote-peer-id":"378283a013db6ca0","recovered-remote-peer-urls":["http>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","caller":"membership/cluster.go:278","msg":"recovered/added member from store","cluster-id":"d7e48328fbade5a9","local-member-id":"6f1cea21cf164812","recovered-remote-peer-id":"495f61aec428708b","recovered-remote-peer-urls":["http>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","caller":"membership/cluster.go:278","msg":"recovered/added member from store","cluster-id":"d7e48328fbade5a9","local-member-id":"6f1cea21cf164812","recovered-remote-peer-id":"6f1cea21cf164812","recovered-remote-peer-urls":["http>
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.793Z","caller":"membership/cluster.go:287","msg":"set cluster version from store","cluster-version":"3.5"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:36.794Z","caller":"auth/store.go:1220","msg":"simple token is not cryptographically signed"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.795Z","caller":"mvcc/kvstore.go:345","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":305453105}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.846Z","caller":"mvcc/kvstore.go:415","msg":"kvstore restored","current-rev":305456694}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.848Z","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.849Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.849Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.850Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"378283a013db6ca0","remote-peer-urls":["https://23.88.123.167:2380"]}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.851Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.852Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.853Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.853Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.854Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.855Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b","remote-peer-urls":["https://167.235.19.84:2380"]}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.855Z","caller":"etcdserver/corrupt.go:46","msg":"starting initial corruption check","local-member-id":"6f1cea21cf164812","timeout":"15s"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.854Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:36 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:36.854Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"6f1cea21cf164812","remote-peer-id":"495f61aec428708b"}
Jan 03 07:31:41 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:41.852Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"378283a013db6ca0","rtt":"0s"}
Jan 03 07:31:41 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:41.852Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"378283a013db6ca0","rtt":"0s"}
Jan 03 07:31:41 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:41.855Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"495f61aec428708b","rtt":"0s"}
Jan 03 07:31:41 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:41.855Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"495f61aec428708b","rtt":"0s"}
Jan 03 07:31:41 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:41.889Z","caller":"etcdserver/corrupt.go:289","msg":"failed hash kv request","local-member-id":"6f1cea21cf164812","requested-revision":305456694,"remote-peer-endpoint":"https://23.88.123.167:2380","error":"Get \"https://23.88.123.167:2380>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:46.854Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:46.854Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:46.856Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:46.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:46.904Z","caller":"etcdserver/corrupt.go:289","msg":"failed hash kv request","local-member-id":"6f1cea21cf164812","requested-revision":305456694,"remote-peer-endpoint":"https://167.235.19.84:2380","error":"Get \"https://167.235.19.84:2380>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.904Z","caller":"etcdserver/corrupt.go:116","msg":"initial corruption checking passed; no corruption","local-member-id":"6f1cea21cf164812"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.904Z","caller":"etcdserver/server.go:842","msg":"starting etcd server","local-member-id":"6f1cea21cf164812","local-server-version":"3.5.4","cluster-id":"d7e48328fbade5a9","cluster-version":"3.5"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.906Z","caller":"etcdserver/server.go:752","msg":"starting initial election tick advance","election-ticks":10}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.908Z","caller":"embed/etcd.go:688","msg":"starting with client TLS","tls-info":"cert = /var/lib/rancher/k3s/server/tls/etcd/server-client.crt, key = /var/lib/rancher/k3s/server/tls/etcd/server-client.key, client-cert=, client-key=, tru>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.909Z","caller":"embed/etcd.go:277","msg":"now serving peer/client/metrics","local-member-id":"6f1cea21cf164812","initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["https://10.0.1.1:2380","https://127.0.0.1:2380>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Running kube-apiserver --advertise-address=10.0.1.1 --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=https://kubernetes.default.svc.cluster.local,k3s --authorization-mode=Node,RBAC --bind-address=1>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Running kube-scheduler --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --bind-address=127.0.0.1 --kubeconfig=/var/lib/>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.913Z","caller":"embed/etcd.go:581","msg":"serving peer traffic","address":"10.0.1.1:2380"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.914Z","caller":"embed/etcd.go:553","msg":"cmux::serve","address":"10.0.1.1:2380"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.914Z","caller":"embed/etcd.go:581","msg":"serving peer traffic","address":"127.0.0.1:2380"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.914Z","caller":"embed/etcd.go:553","msg":"cmux::serve","address":"127.0.0.1:2380"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:46.914Z","caller":"embed/etcd.go:763","msg":"serving metrics","address":"http://127.0.0.1:2381"}
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --bind-ad>
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Tunnel server egress proxy mode: agent"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Server node token is available at /var/lib/rancher/k3s/server/token"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="To join server node to cluster: k3s server -s https://:6443 -t ${SERVER_NODE_TOKEN}"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Agent node token is available at /var/lib/rancher/k3s/server/agent-token"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="To join agent node to cluster: k3s agent -s https://:6443 -t ${AGENT_NODE_TOKEN}"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="Run: k3s kubectl"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="certificate CN=core-control-plane-0 signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 07:31:46 +0000 UTC"
Jan 03 07:31:46 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:46Z" level=info msg="certificate CN=system:node:core-control-plane-0,O=system:nodes signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 07:31:46 +0000 UTC"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Module overlay was already loaded"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Module nf_conntrack was already loaded"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Module br_netfilter was already loaded"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Module iptable_nat was already loaded"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Module iptable_filter was already loaded"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=warning msg="The wireguard backend is deprecated and will be removed in k3s v1.26, please switch to wireguard-native. Check our docs for information about how to migrate."
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Logging containerd to /var/lib/rancher/k3s/agent/containerd/containerd.log"
Jan 03 07:31:47 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:47Z" level=info msg="Running containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd"
Jan 03 07:31:48 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:48Z" level=info msg="Containerd is now running"
Jan 03 07:31:48 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:48Z" level=info msg="Running kubelet --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt --cloud-provider=external --c>
Jan 03 07:31:48 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:48Z" level=info msg="Connecting to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Jan 03 07:31:48 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:48Z" level=info msg="Handling backend connection request [core-control-plane-0]"
Jan 03 07:31:48 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:48Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-storage-0-trusty-piranha]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-control-plane-2]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-storage-2-capital-jackass]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-small-2-fun-lacewing]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-medium-0-united-eft]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-small-0-mint-sawfly]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-storage-1-major-bedbug]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-medium-3-tight-halibut]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-small-1-allowed-treefrog]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-control-plane-1]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-small-3-engaging-katydid]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-medium-2-becoming-hare]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Handling backend connection request [core-medium-1-renewing-hookworm]"
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:51.854Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:51.854Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:51.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:51.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:51 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:51Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:52 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:52Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
Jan 03 07:31:53 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:53Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 07:31:55 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:55.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 is starting a new election at term 46"}
Jan 03 07:31:55 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:55.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 became pre-candidate at term 46"}
Jan 03 07:31:55 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:55.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 received MsgPreVoteResp from 6f1cea21cf164812 at term 46"}
Jan 03 07:31:55 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:55.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 [logterm: 46, index: 337443916] sent MsgPreVote request to 378283a013db6ca0 at term 46"}
Jan 03 07:31:55 core-control-plane-0 k3s[7264]: {"level":"info","ts":"2023-01-03T07:31:55.294Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"6f1cea21cf164812 [logterm: 46, index: 337443916] sent MsgPreVote request to 495f61aec428708b at term 46"}
Jan 03 07:31:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:56.855Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:56.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"378283a013db6ca0","rtt":"0s","error":"EOF"}
Jan 03 07:31:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:56.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:56 core-control-plane-0 k3s[7264]: {"level":"warn","ts":"2023-01-03T07:31:56.857Z","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"495f61aec428708b","rtt":"0s","error":"EOF"}
Jan 03 07:31:56 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:56Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 07:31:58 core-control-plane-0 k3s[7264]: time="2023-01-03T07:31:58Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
```

Removing --node-ip from the k3s args and restarting lets it join the etcd cluster again.

So that's my first big question: Is there a migration path to using --node-ip on servers?
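(One path I can imagine, but have not verified: since etcd stores each member's advertised peer URL, that member record would presumably also need updating, e.g. with etcdctl against the embedded etcd. A sketch only, assuming an etcdctl binary on the node; the member ID and cert paths are this cluster's, taken from the logs above:)

```sh
# Untested sketch: update this member's advertised peer URL in the embedded etcd,
# using the same client certs that k3s passes to kube-apiserver (paths from the logs).
e() {
  etcdctl --endpoints=https://127.0.0.1:2379 \
    --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
    --cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
    --key=/var/lib/rancher/k3s/server/tls/etcd/client.key "$@"
}
e member list -w table                 # find the member ID (6f1cea21cf164812 here)
e member update 6f1cea21cf164812 --peer-urls=https://10.0.1.1:2380
```

Even if that works, it wouldn't by itself fix the peer-certificate SANs on the other members, so it may only be part of the answer.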

--tls-san doesn't help

I also tried adding --tls-san to that server's startup command, but that didn't fix log retrieval. Maybe it is only evaluated on cluster-init?
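As far as I understand, --tls-san only adds SANs to the supervisor/apiserver listener on 6443, while the failing request above hits the kubelet's serving certificate on 10250, which is built from the node's registered IPs. Comparing the two makes that visible (a sketch):

```sh
# SANs of the supervisor/apiserver cert on 6443 (what --tls-san feeds):
openssl s_client -connect 10.0.1.1:6443 </dev/null 2>/dev/null \
  | openssl x509 -noout -ext subjectAltName
# SANs of the kubelet serving cert on 10250 (derived from the node's registered IPs):
openssl s_client -connect 10.0.1.1:10250 </dev/null 2>/dev/null \
  | openssl x509 -noout -ext subjectAltName
```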

Recreation instead of Reconfiguration fails differently

I also tried adding control-plane-0 as a fresh member to the cluster (deleting the server and creating it again, this time with --node-ip and also --advertise-address set from the start, both pointing to the private IPv4).
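That is, roughly this invocation on the recreated node (a sketch; the join server and token are placeholders):

```sh
# Hypothetical fresh join of core-control-plane-0 with private addressing from the start
curl -sfL https://get.k3s.io | sh -s - server \
  --server "https://10.0.1.2:6443" \
  --token "${SERVER_NODE_TOKEN}" \
  --node-ip 10.0.1.1 \
  --advertise-address 10.0.1.1
```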

Full log of control-plane-0:

```
Jan 03 08:15:24 core-control-plane-0 systemd[1]: Starting Lightweight Kubernetes...
Jan 03 08:15:24 core-control-plane-0 sh[1769]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Jan 03 08:15:24 core-control-plane-0 sh[1770]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Starting k3s v1.23.15+k3s1 (50cab3b3)"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Managed etcd cluster not yet initialized"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Reconciling bootstrap data between datastore and disk"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:admin,O=system:masters signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:kube-controller-manager signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:kube-scheduler signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:apiserver,O=system:masters signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:kube-proxy signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:k3s-controller signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=k3s-cloud-controller-manager signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=kube-apiserver signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:auth-proxy signed by CN=k3s-request-header-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=etcd-server signed by CN=etcd-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=etcd-client signed by CN=etcd-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=etcd-peer signed by CN=etcd-peer-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Running kube-apiserver --advertise-address=10.0.1.1 --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=https://kubernetes.default.svc.cluster.local,k3s --authorization-mode=Node,RBAC --bind-address=127.0.0.1 --cert-dir=/var/lib/rancher/k3s/server/tls/temporary-certs --client-ca-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --egress-selector-config-file=/var/lib/rancher/k3s/server/etc/egress-selector-config.yaml --enable-admission-plugins=NodeRestriction --enable-aggregator-routing=true --etcd-cafile=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --etcd-certfile=/var/lib/rancher/k3s/server/tls/etcd/client.crt --etcd-keyfile=/var/lib/rancher/k3s/server/tls/etcd/client.key --etcd-servers=https://127.0.0.1:2379 --feature-gates=JobTrackingWithFinalizers=true --insecure-port=0 --kubelet-certificate-authority=/var/lib/rancher/k3s/server/tls/server-ca.crt --kubelet-client-certificate=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.crt --kubelet-client-key=/var/lib/rancher/k3s/server/tls/client-kube-apiserver.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --profiling=false --proxy-client-cert-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.crt --proxy-client-key-file=/var/lib/rancher/k3s/server/tls/client-auth-proxy.key --requestheader-allowed-names=system:auth-proxy --requestheader-client-ca-file=/var/lib/rancher/k3s/server/tls/request-header-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6444 --service-account-issuer=https://kubernetes.default.svc.cluster.local --service-account-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.2.0.0/16 --service-node-port-range=30000-32767 --storage-backend=etcd3 --tls-cert-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt --tls-private-key-file=/var/lib/rancher/k3s/server/tls/serving-kube-apiserver.key"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Tunnel server egress proxy mode: agent"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Running kube-scheduler --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --profiling=false --secure-port=10259"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --bind-address=127.0.0.1 --cluster-cidr=10.1.0.0/16 --cluster-signing-kube-apiserver-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-kube-apiserver-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-client-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-kubelet-client-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --cluster-signing-kubelet-serving-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --cluster-signing-kubelet-serving-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --cluster-signing-legacy-unknown-cert-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --cluster-signing-legacy-unknown-key-file=/var/lib/rancher/k3s/server/tls/server-ca.key --feature-gates=JobTrackingWithFinalizers=true --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=10257 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range=10.2.0.0/16 --use-service-account-credentials=true"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Server node token is available at /var/lib/rancher/k3s/server/token"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="To join server node to cluster: k3s server -s https://:6443 -t ${SERVER_NODE_TOKEN}"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Agent node token is available at /var/lib/rancher/k3s/server/agent-token"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="To join agent node to cluster: k3s agent -s https://:6443 -t ${AGENT_NODE_TOKEN}"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Run: k3s kubectl"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=warning msg="dynamiclistener [::]:6443: no cached certificate available for preload - deferring certificate load until storage initialization or first client request"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Active TLS secret / (ver=) (count 10): map[listener.cattle.io/cn-10.0.1.1:10.0.1.1 listener.cattle.io/cn-10.2.0.1:10.2.0.1 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-__1-f16284:::1 listener.cattle.io/cn-core-control-plane-0:core-control-plane-0 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/fingerprint:SHA1=B55A7BB8500726B769372D48CFE43CA048E90E22]"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Active TLS secret / (ver=) (count 10): map[listener.cattle.io/cn-10.0.1.1:10.0.1.1 listener.cattle.io/cn-10.2.0.1:10.2.0.1 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-__1-f16284:::1 listener.cattle.io/cn-core-control-plane-0:core-control-plane-0 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/fingerprint:SHA1=A90D7F4E8F04285FFF7B319CCB0AB35CDEF699B9]"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=core-control-plane-0 signed by CN=k3s-server-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="certificate CN=system:node:core-control-plane-0,O=system:nodes signed by CN=k3s-client-ca@1633418218: notBefore=2021-10-05 07:16:58 +0000 UTC notAfter=2024-01-03 08:15:25 +0000 UTC"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Module overlay was already loaded"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Module br_netfilter was already loaded"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Set sysctl 'net/ipv4/conf/all/forwarding' to 1"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_max' to 131072"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=warning msg="The wireguard backend is deprecated and will be removed in k3s v1.26, please switch to wireguard-native. Check our docs for information about how to migrate."
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Logging containerd to /var/lib/rancher/k3s/agent/containerd/containerd.log"
Jan 03 08:15:25 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:25Z" level=info msg="Running containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Containerd is now running"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Running kubelet --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd --client-ca-file=/var/lib/rancher/k3s/agent/client-ca.crt --cloud-provider=external --cluster-dns=10.2.0.10 --cluster-domain=cluster.local --cni-bin-dir=/var/lib/rancher/k3s/data/8337c3c448159e502751ad88e10cd3bb2d8572cf28d9351bef216d68aaed3413/bin --cni-conf-dir=/var/lib/rancher/k3s/agent/etc/cni/net.d --container-runtime=remote --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --healthz-bind-address=127.0.0.1 --hostname-override=core-control-plane-0 --kubeconfig=/var/lib/rancher/k3s/agent/kubelet.kubeconfig --node-ip=10.0.1.1 --node-labels= --pod-manifest-path=/var/lib/rancher/k3s/agent/pod-manifests --read-only-port=0 --resolv-conf=/run/systemd/resolve/resolv.conf --serialize-image-pulls=false --tls-cert-file=/var/lib/rancher/k3s/agent/serving-kubelet.crt --tls-private-key-file=/var/lib/rancher/k3s/agent/serving-kubelet.key"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Connecting to proxy" url="wss://127.0.0.1:6443/v1-k3s/connect"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Handling backend connection request [core-control-plane-0]"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Adding member core-control-plane-0-898acd78=https://10.0.1.1:2380 to etcd cluster [core-control-plane-1-11456e80=https://23.88.123.167:2380 core-control-plane-2-9d063b64=https://167.235.19.84:2380]"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:26Z" level=info msg="Starting etcd to join cluster with members [core-control-plane-1-11456e80=https://23.88.123.167:2380 core-control-plane-2-9d063b64=https://167.235.19.84:2380 core-control-plane-0-898acd78=https://10.0.1.1:2380]"
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:26.796Z","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["https://10.0.1.1:2380","https://127.0.0.1:2380"]}
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:26.796Z","caller":"embed/etcd.go:479","msg":"starting with peer TLS","tls-info":"cert = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.crt, key = /var/lib/rancher/k3s/server/tls/etcd/peer-server-client.key, client-cert=, client-key=, trusted-ca = /var/lib/rancher/k3s/server/tls/etcd/peer-ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":[]}
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:26.797Z","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["https://10.0.1.1:2379","https://127.0.0.1:2379"]}
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:26.798Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.17.13","go-os":"linux","go-arch":"amd64","max-cpu-set":2,"max-cpu-available":2,"member-initialized":false,"name":"core-control-plane-0-898acd78","data-dir":"/var/lib/rancher/k3s/server/db/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/rancher/k3s/server/db/etcd/member","force-new-cluster":false,"heartbeat-interval":"500ms","election-timeout":"5s","initial-election-tick-advance":true,"snapshot-count":10000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["https://10.0.1.1:2380","https://127.0.0.1:2380"],"advertise-client-urls":["https://10.0.1.1:2379"],"listen-client-urls":["https://10.0.1.1:2379","https://127.0.0.1:2379"],"listen-metrics-urls":["http://127.0.0.1:2381"],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"core-control-plane-0-898acd78=https://10.0.1.1:2380,core-control-plane-1-11456e80=https://23.88.123.167:2380,core-control-plane-2-9d063b64=https://167.235.19.84:2380","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster","quota-size-bytes":2147483648,"pre-vote":true,"initial-corrupt-check":true,"corrupt-check-time-interval":"0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
Jan 03 08:15:26 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:26.803Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/rancher/k3s/server/db/etcd/member/snap/db","took":"4.819977ms"}
Jan 03 08:15:30 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:30Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:31Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:31.816Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39814","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:31.819Z","caller":"etcdserver/cluster_util.go:79","msg":"failed to get cluster response","address":"https://167.235.19.84:2380/members","error":"Get \"https://167.235.19.84:2380/members\": EOF"}
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:31.832Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39812","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:31.832Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39798","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:31 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:31.951Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39826","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:32 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:32.073Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39828","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:32 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:32.083Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39838","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:35 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:35.020Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39840","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:35 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:35Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jan 03 08:15:35 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:35.822Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39850","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:35 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:35.834Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39856","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:36Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:6443/v1-k3s/readyz: 500 Internal Server Error"
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:36.840Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:39858","server-name":"","ip-addresses":["127.0.0.1","::1","23.88.123.167","2a01:4f8:1c17:c664::1","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: {"level":"warn","ts":"2023-01-03T08:15:36.840Z","caller":"etcdserver/cluster_util.go:79","msg":"failed to get cluster response","address":"https://23.88.123.167:2380/members","error":"Get \"https://23.88.123.167:2380/members\": EOF"}
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:36.842Z","caller":"embed/etcd.go:368","msg":"closing etcd server","name":"core-control-plane-0-898acd78","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://10.0.1.1:2379"]}
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: {"level":"info","ts":"2023-01-03T08:15:36.843Z","caller":"embed/etcd.go:370","msg":"closed etcd server","name":"core-control-plane-0-898acd78","data-dir":"/var/lib/rancher/k3s/server/db/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["https://10.0.1.1:2379"]}
Jan 03 08:15:36 core-control-plane-0 k3s[1773]: time="2023-01-03T08:15:36Z" level=fatal msg="ETCD join failed: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs"
```

The interesting part of the control-plane-1 logs looks like this:

Jan 03 08:23:39 core-control-plane-1 k3s[1679]: {"level":"warn","ts":"2023-01-03T08:23:39.834Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"<public-ipv4(of-cp0)>:41662","server-name":"","ip-addresses":["10.0.1.1","127.0.0.1","::1","10.0.1.1","10.2.0.1"],"dns-names":["localhost","core-control-plane-0"],"error":"tls: \"<public-ipv4(of-cp0)>\" does not match any of DNSNames [\"localhost\" \"core-control-plane-0\"] (lookup core-control-plane-0: Try again)"}

So I wonder why control-plane-0 registers in etcd with its public IPv4. I found this comment which says that should not be the case: https://github.com/k3s-io/k3s/issues/2533#issuecomment-748387146

So I guess the setup somehow thinks that the public IPs are the private ones?
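
To see which addresses the node actually ended up registering, something like this should work (with any working kubeconfig):

```
# Show the addresses set on the node object; comparing these against the
# SANs in the serving cert makes the mismatch visible.
kubectl get node core-control-plane-0 \
  -o jsonpath='{range .status.addresses[*]}{.type}{"\t"}{.address}{"\n"}{end}'
```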

toabi commented 1 year ago

Next attempt: adding --node-external-ip as well.

This moves the error to control-plane-0, which now logs:

Jan 03 10:46:19 core-control-plane-0 k3s[20099]: {"level":"warn","ts":"2023-01-03T10:46:19.875Z","caller":"embed/config_logging.go:160","msg":"rejected connection","remote-addr":"10.0.1.2:43660","server-name":"","ip-addresses":["127.0.0.1","::1","<public-ipv4-cp1>","<ipv6-cp1>","10.2.0.1"],"dns-names":["localhost","core-control-plane-1"],"error":"tls: \"10.0.1.2\" does not match any of DNSNames [\"localhost\" \"core-control-plane-1\"] (lookup core-control-plane-1: Try again)"}

while control-plane-1 says:

Jan 03 10:44:45 core-control-plane-1 k3s[1679]: {"level":"warn","ts":"2023-01-03T10:44:45.455Z","caller":"etcdserver/cluster_util.go:288","msg":"failed to reach the peer URL","address":"https://10.0.1.1:2380/version","remote-member-id":"8aa818d1fa6327ef","error":"Get \"https://10.0.1.1:2380/version\": dial tcp 10.0.1.1:2380: connect: connection refused"}

So it looks like taking all control-plane hosts down and additionally defining the public IP via command-line arguments might be a solution; see the sketch below.
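
A rough sketch of what that could look like per node, using k3s's config file instead of editing the systemd unit (the values are placeholders for each node's real addresses):

```
# Sketch for core-control-plane-0; /etc/rancher/k3s/config.yaml is read by
# k3s on startup, and the keys are the CLI flags without the leading dashes.
cat <<'EOF' > /etc/rancher/k3s/config.yaml
node-ip: 10.0.1.1                # private IP: etcd peering, kubelet, cert SAN
node-external-ip: <public-ipv4>  # public IP, so it also lands in the SANs
advertise-address: 10.0.1.1      # address the apiserver advertises
EOF
systemctl restart k3s
```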

What makes this harder is that the public IP is not known during server creation, and therefore can't be baked into the snippet in the cloud-init file.
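
One way around that might be to resolve the public IP at boot time instead of templating it, assuming Hetzner's metadata service exposes it (the endpoint and the interface name below are assumptions/placeholders):

```
#!/bin/sh
# Sketch: discover this server's IPs at boot, then hand them to the installer.
# Assumption: the Hetzner Cloud metadata endpoint below exists and returns the
# bare public IPv4 address.
PUBLIC_IP="$(curl -sf http://169.254.169.254/hetzner/v1/metadata/public-ipv4)"
# 'enp7s0' is a placeholder for the private-network interface on these servers.
PRIVATE_IP="$(ip -4 -o addr show dev enp7s0 | awk '{print $4}' | cut -d/ -f1)"
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server \
  --node-ip ${PRIVATE_IP} --node-external-ip ${PUBLIC_IP}" sh -
```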

Otherwise, using the hostnames instead of the IPv4 addresses would also be a solution, but that would mean running a DNS server somewhere and somehow configuring k3s to use those hostnames instead?
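
A cheaper variant than a real DNS server might be static /etc/hosts entries pushed via cloud-init, so the peer hostnames resolve on every node (just a sketch; whether k3s would then match the DNS SANs is an open question):

```
# Sketch: make the control-plane names resolvable on every node, so the
# "lookup core-control-plane-N: Try again" part of the TLS error goes away.
cat <<'EOF' >> /etc/hosts
10.0.1.1 core-control-plane-0
10.0.1.2 core-control-plane-1
10.0.1.3 core-control-plane-2
EOF
```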

toabi commented 1 year ago

While researching more, I think https://github.com/k3s-io/k3s/issues/3551#issuecomment-970122524 describes a similar problem. Now the question is how to actually migrate existing clusters to use all those arguments…

toabi commented 1 year ago

I tried to migrate the current setup to the new setup by stopping 2 of 3 control-plane nodes, then changing the command-line args on the last one to use node-ip, advertise-ip and node-external-ip. It came up, but was not happy/healthy.

Sadly, control-plane-0 (also with node-ip/advertise-ip/external-ip set) could not join the cluster anymore, with this error:

Jan 03 13:42:29 core-control-plane-0 k3s[1747]: time="2023-01-03T13:42:29Z" level=fatal msg="ETCD join failed: error validating peerURLs {ClusterID:d7e48328fbade5a9 Members:[&{ID:378283a013db6ca0 RaftAttributes:{PeerURLs:[https://<ipv4-cp-1>:2380] IsLearner:false} Attributes:{Name:core-control-plane-1-11456e80 ClientURLs:[https://<ipv4-cp-1>:2379]}} &{ID:495f61aec428708b RaftAttributes:{PeerURLs:[https://<ipv4-cp2>:2380] IsLearner:false} Attributes:{Name:core-control-plane-2-9d063b64 ClientURLs:[https://<ipv4-cp2>:2379]}}] RemovedMemberIDs:[]}: PeerURLs: no match found for existing member (378283a013db6ca0, [https://<ipv4-cp-1>:2380]), last resolver error (\"https://<ipv4-cp-1>:2380\"(resolved from \"https://<ipv4-cp-1>:2380\") != \"https://10.0.1.1:2380\"(resolved from \"https://10.0.1.1:2380\"))"

So it looks like cp-1 was still part of the etcd cluster and did not get removed… but before cp-0 can join etcd again, the cluster needs to be in a healthy state.
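
The embedded etcd can be inspected with etcdctl using the certs k3s writes under /var/lib/rancher/k3s/server/tls/etcd/ (the exact filenames below are my assumption based on that directory), which should also allow removing the stale member from a healthy node:

```
# Sketch: list the members on a node where etcd is still healthy, then remove
# the stale entry by its hex ID so the rejoining node can be re-added.
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key \
  member list -w table
# then, with the ID taken from the error above:
# etcdctl ... member remove 378283a013db6ca0
```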

toabi commented 1 year ago

The next thing I'll probably try is to completely kill the control plane, restore an etcd backup, and then try to make it HA again.
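
k3s ships snapshot tooling for exactly that; roughly (the default snapshot directory, as far as I know, is /var/lib/rancher/k3s/server/db/snapshots):

```
# Sketch: take an on-demand etcd snapshot before tearing things down,
# and list what's already there.
k3s etcd-snapshot save --name pre-migration
k3s etcd-snapshot ls
```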

brandond commented 1 year ago

K3s issues kubelet certificates using the addresses that it is aware of when the node starts up. If you have something like the hetzner cloud provider that is adding or changing the node addresses while initializing the node, the kubelet cert obviously won't be valid for those, as they were not known at the time the certificate was requested. You should ensure that all the addresses are provided as --node-ip or --node-external-ip so that the SANs on the certs will match the addresses set by the cloud provider.
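
After restarting with those flags set, one way to double-check the result is to inspect the SANs of the serving cert on disk (the path is taken from the --tls-cert-file kubelet argument in the logs above; -ext needs a reasonably recent OpenSSL):

```
# Print only the subjectAltName extension of the kubelet serving cert.
openssl x509 -in /var/lib/rancher/k3s/agent/serving-kubelet.crt \
  -noout -ext subjectAltName
```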

toabi commented 1 year ago

Yeah, well, it somehow all worked with k3s 1.21, apparently because the surrounding setup did not change. Something in 1.22+ changed how IP addresses are used.

But I'll try to migrate the setup to one which explicitly sets the IP addresses via CLI params. The only hurdle currently is the migration of etcd, but I'll try changing it to a single-node cluster using --cluster-reset next.
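
For reference, the reset/restore combination I plan to try (the restore path is a placeholder):

```
# Sketch: collapse etcd to a single-member cluster on the surviving server.
systemctl stop k3s
k3s server --cluster-reset
# or, to restore from a snapshot instead:
# k3s server --cluster-reset \
#   --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/<snapshot>
```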

toabi commented 1 year ago

Okay, so I've tested and played around with this a lot now. The good thing is that k3s is not too complex.

The migration strategy from the "wrong" IP setup with an HA control plane looks roughly like this:

In the end etcd should talk via private IPs and the certs will also contain the private IPs.