Closed by dmorn 4 months ago
See https://docs.k3s.io/networking/distributed-multicloud:
Embedded etcd is not supported in this type of deployment. If using embedded etcd, all server nodes must be reachable to each other via their private IPs. Agents may be distributed over multiple networks, but all servers should be in the same location.
All etcd nodes must be on the same private network.
Hi @brandond! They are.
OK, but can they reach each other at their private IPs? It appears they cannot based on your logs:
May 22 10:34:08 control-cax21-nbg1 k3s[210139]: {"level":"warn","ts":"2024-05-22T10:34:08.562614Z","logger":"etcd-client","caller":"v3@v3.5.9-k3s1/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x40007cb880/142.132.176.81:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
May 22 10:34:08 control-cax21-nbg1 k3s[210139]: time="2024-05-22T10:34:08Z" level=fatal msg="etcd cluster join failed: context deadline exceeded"
May 22 10:34:08 control-cax21-nbg1 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Are you using public IPs as the nodes' private addresses?
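For what it's worth, the endpoint in the etcd-client warning above can be classified with Python's standard `ipaddress` module. This is only a rough sketch: `endpoint_host` is a hypothetical helper written for this example, and the log payload is trimmed to the fields used here.

```python
import ipaddress
import json

# JSON payload of the etcd-client warning from the log, trimmed to the fields used here.
log_line = (
    '{"level":"warn",'
    '"target":"etcd-endpoints://0x40007cb880/142.132.176.81:2379",'
    '"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}'
)


def endpoint_host(target: str) -> str:
    """Return the host part of an 'etcd-endpoints://<id>/<host>:<port>' target."""
    host_port = target.rsplit("/", 1)[-1]   # "142.132.176.81:2379"
    host, _, _port = host_port.rpartition(":")
    return host


host = endpoint_host(json.loads(log_line)["target"])
addr = ipaddress.ip_address(host)
# is_private is True only for reserved private ranges such as 10.0.0.0/8.
print(host, "private" if addr.is_private else "public")
```

For the address in the log this reports a public address, which is consistent with the join attempt going out over the external IP rather than a private one.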
Nope, that's the thing. I'm setting node-external-ip but not node-ip, because the logs say that value is overridden by the VPN configuration. Do I need to set node-ip as well?
Do you have any idea why the nodes wouldn't be able to reach each other at the selected addresses? Do you have firewall rules or something else in place that is blocking the etcd traffic?
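One way to answer that from the joining server is a plain TCP probe of the etcd ports on the existing server. A minimal sketch, assuming the default etcd client and peer ports (2379 and 2380); `can_reach` and `check_peer` are hypothetical helper names, and the address you pass in should be whatever private or VPN address the first server actually has.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_peer(host: str, ports=(2379, 2380), timeout: float = 3.0) -> dict:
    """Probe each etcd port on host and map port -> reachable."""
    return {port: can_reach(host, port, timeout) for port in ports}
```

Running something like `check_peer("10.0.0.2")` on the second server (the address here is a placeholder, not taken from this issue) shows whether a firewall is dropping the traffic. A successful TCP connect only proves the port is open, not that etcd's TLS handshake will succeed.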
Yes, I do have an idea. The nodes are trying to use the external address to communicate, and yes, that traffic is not allowed by the firewall rules! Setting node-ip in my previous attempts didn't seem to help, but I would have to check again. The idea as I understand would be to
Environmental Info: K3s Version:
k3s version v1.29.4+k3s1 (94e29e2e)
go version go1.21.9
Node(s) CPU architecture, OS, and Version:
Linux control-cax21-fsn1 5.10.0-29-arm64 #1 SMP Debian 5.10.216-1 (2024-05-03) aarch64 GNU/Linux
Linux control-cax21-nbg1 5.10.0-28-arm64 #1 SMP Debian 5.10.209-2 (2024-01-31) aarch64 GNU/Linu
Cluster Configuration:
To reproduce, 2 control nodes
Describe the bug:
I'm using the VPN feature. I can add agent nodes, but I cannot add server ones for an HA setup.
Steps To Reproduce:
Installed K3s on the master node:
Installed k3s on the second control node
Expected behavior:
I expect nodes to just join the cluster
Actual behavior:
The second control node keeps on crashing
Additional context / logs:
Logs from the second control plane node.