Open NandoTheessen opened 3 weeks ago
Did you set --node-ip and --node-external-ip to the correct values for each of the agents, or just the servers?
Based on the information you shared, it sounds like the apiserver is trying to connect to the kubelet's external IP to get logs. Normally it would connect to the internal IP using the agent tunnel, so I suspect that the internal and external IPs are not being set properly.
Thanks for the help Brandon!
For the servers, node-ip defaults to the private IP (I believe), which would be 192.x.x.x. The agents don't have these set, as they only have public IP addresses, which are used as the nodes' IP addresses.
Should I set node-ip and node-external-ip specifically to their public addresses?
This is the current setup
Server node 1: node-ip not set, node-external-ip set to NAT gateway
Server node 2: node-ip not set, node-external-ip set to NAT gateway
Server node 3: node-ip not set, node-external-ip set to NAT gateway
Agent 1: node-ip and node-external-ip not set; only has a public IP
Agent 2: node-ip and node-external-ip not set; only has a public IP
Agent 3: node-ip and node-external-ip not set; only has a public IP
Related to #7355, I think. Based on the comment https://github.com/k3s-io/k3s/issues/7355#issuecomment-1523635066, I'm still unable to get it working right.
Thanks for linking that issue @tdtgit ! I don't think it is the same, but it helped me narrow down the problem a bit. Mind you, I'm not entirely sure that what I'm trying to achieve is even possible. Concretely, this is where I'm doubtful:
Since I only have one NAT gateway, I only have one public IP address. So this is my server config:
MY_EXTERNAL_IP=80.xxx.xxx.xxx
server 1: --node-ip 192.168.88.2 --node-external-ip ${MY_EXTERNAL_IP} --flannel-external-ip
server 2: --node-ip <internal-ip> --node-external-ip ${MY_EXTERNAL_IP} --flannel-external-ip
server 3: --node-ip <internal-ip> --node-external-ip ${MY_EXTERNAL_IP} --flannel-external-ip
My agent config:
agent 1: --node-ip 80.xxx.xxx.xxx --node-external-ip 80.xxx.xxx.xxx --server https://80.xxx.xxx.xxx:6443
agent 2: --node-ip 80.xxx.xxx.xxx --node-external-ip 80.xxx.xxx.xxx --server https://80.xxx.xxx.xxx:6443
agent 3: --node-ip 80.xxx.xxx.xxx --node-external-ip 80.xxx.xxx.xxx --server https://80.xxx.xxx.xxx:6443
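One way to make the concern about the single NAT address concrete is to group the nodes by the external IP they advertise. The following is a minimal Python sketch, not part of k3s: the node names and the 203.0.113.x addresses are placeholders standing in for the redacted 80.xxx.xxx.xxx values, and the agents are assumed to each have a distinct public IP.

```python
from collections import defaultdict

# Hypothetical inventory mirroring the --node-external-ip flags above.
# 203.0.113.x are documentation addresses, not the real (redacted) ones.
NODES = {
    "server1": "203.0.113.10",  # NAT gateway IP
    "server2": "203.0.113.10",  # NAT gateway IP
    "server3": "203.0.113.10",  # NAT gateway IP
    "agent1": "203.0.113.21",   # assumed distinct public IP
    "agent2": "203.0.113.22",   # assumed distinct public IP
    "agent3": "203.0.113.23",   # assumed distinct public IP
}


def shared_external_ips(nodes):
    """Return {external_ip: [node names]} for IPs advertised by more than one node."""
    by_ip = defaultdict(list)
    for name, external_ip in nodes.items():
        by_ip[external_ip].append(name)
    return {ip: sorted(names) for ip, names in by_ip.items() if len(names) > 1}


if __name__ == "__main__":
    for ip, names in shared_external_ips(NODES).items():
        print(f"{ip} is advertised by: {', '.join(names)}")
```

Since --flannel-external-ip tells flannel to use the node external IP for flannel traffic, any group reported here would presumably share a single WireGuard endpoint (same IP, same port 51820), and a NAT that forwards that port to one host can only deliver it to one of them.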
Here is some additional information:
wg show
I get this output (server 3):
interface: flannel-wg
public key: xxxx
private key: (hidden)
listening port: 51820
peer: xxxx
endpoint: 80.xxx.xxx.xxx:51820
allowed ips: 10.42.3.0/24
latest handshake: 41 seconds ago
transfer: 1.96 KiB received, 764 B sent
persistent keepalive: every 25 seconds
peer: xxxx
endpoint: 80.xxx.xxx.xxx:51820
allowed ips: 10.42.4.0/24
latest handshake: 1 minute, 27 seconds ago
transfer: 1.23 KiB received, 1.27 KiB sent
persistent keepalive: every 25 seconds
peer: xxxx
endpoint: 80.xxx.xxx.xxx:51820
allowed ips: 10.42.5.0/24
latest handshake: 1 minute, 38 seconds ago
transfer: 556.30 KiB received, 282.04 KiB sent
persistent keepalive: every 25 seconds
peer: xxxx
endpoint: 80.xxx.xxx.xxx:51820
allowed ips: 10.42.0.0/24
transfer: 0 B received, 17.63 KiB sent
persistent keepalive: every 25 seconds
What I can see from this is that server 3 has only peered with one other server instead of two! We're missing a peer here, and I assume that is related to the NAT gateway, which forwards all traffic to one server (server 1).
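The same check can be done mechanically. Below is a rough Python sketch that parses `wg show` style output, counts the peers, and lists peers that have never completed a handshake. The sample text is abridged from the output above, with hypothetical peer keys and placeholder endpoints; the expected peer count of 5 is an assumption based on a 6-node cluster (3 servers and 3 agents) where every node peers with every other node.

```python
# Abridged sample in the shape of the `wg show` output above.
# Peer keys and endpoints are placeholders, not real values.
SAMPLE = """\
interface: flannel-wg
  listening port: 51820
peer: aaaa
  endpoint: 203.0.113.21:51820
  allowed ips: 10.42.3.0/24
  latest handshake: 41 seconds ago
peer: bbbb
  endpoint: 203.0.113.22:51820
  allowed ips: 10.42.4.0/24
  latest handshake: 1 minute, 27 seconds ago
peer: cccc
  endpoint: 203.0.113.23:51820
  allowed ips: 10.42.5.0/24
  latest handshake: 1 minute, 38 seconds ago
peer: dddd
  endpoint: 203.0.113.10:51820
  allowed ips: 10.42.0.0/24
  transfer: 0 B received, 17.63 KiB sent
"""


def parse_peers(text):
    """Parse `wg show` style text into a list of dicts, one per peer."""
    peers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so "ip:port" values stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "peer":
            current = {"peer": value}
            peers.append(current)
        elif current is not None:
            current[key] = value
    return peers


def summarize(text, expected_peers):
    """Return (peer count, missing peer count, allowed IPs of peers with no handshake)."""
    peers = parse_peers(text)
    no_handshake = [
        p.get("allowed ips", "?") for p in peers if "latest handshake" not in p
    ]
    return len(peers), expected_peers - len(peers), no_handshake


if __name__ == "__main__":
    # Assumption: 6 nodes in total, so each node should see 5 peers.
    found, missing, no_handshake = summarize(SAMPLE, expected_peers=5)
    print(f"peers found: {found}, missing: {missing}")
    print(f"peers without a handshake: {no_handshake}")
```

In the output above, besides the missing peer, the peer with allowed IPs 10.42.0.0/24 shows no handshake line and 0 B received, so traffic sent to that endpoint does not appear to be answered either.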
I've indeed managed to fix this by assigning public IPs to all of my servers. I have one last issue that persists though.
I can read the logs from all pods except the ones on server2 & server3; there I receive a "502: bad gateway" error.
I'm sure this was spotted before in the wild, could you give me some pointers?
Environmental Info:
K3s Version: k3s version v1.29.3+k3s1 (8aecc26b) go version go1.21.8
Node(s) CPU architecture, OS, and Version: arm64 and amd64, Ubuntu 22.04
Cluster Configuration: 3 servers on software-defined networking behind a NAT with a public IP; 3 public agents
Flannel backend is wireguard-native, the ports are allowed in the NAT, and the server nodes are tagged with the external IP flag, which is set to the NAT's IP
Describe the bug: Multiple:
Pods on the agent nodes can't reach pods on the server nodes. I also can't get logs from pods on the server nodes:
➜ ~ kubectl -n kube-system logs metallb-speaker-gnc5j
Error from server: Get "https://<public-ip-nat>:10250/containerLogs/kube-system/metallb-speaker-gnc5j/metallb-speaker": proxy error from 127.0.0.1:6443 while dialing <public-ip-nat>:10250, code 502: 502 Bad Gateway
Steps To Reproduce:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server" sh -s - --flannel-backend wireguard-native --token <token> --disable servicelb --write-kubeconfig-mode 644 --node-external-ip <public-ip-nat> --flannel-external-ip --disable traefik
Installed a helm chart for metallb, and a daemonset for speakers is deployed
Expected behavior: The pods are able to communicate with each other, and I'm able to get logs from all pods.
Actual behavior: As described in "bug", pods can't speak to each other and I can't get logs from pods on the server nodes
Additional context / logs: Which logs would help? Happy to supply whatever is needed