Open vast0906 opened 4 months ago
Hey again! Your proposal describes server-client communication, where the client knows the endpoint of the server but the server only knows the public key of the client. In this scenario, the client can communicate with the server but the server can't communicate with the client until the client contacts it first, right?
The problem with the previous approach with Kubernetes is that the architecture is not server-client when it comes to pod-pod communication. We are creating a mesh of tunnels between the nodes. Imagine a cluster of 3 nodes (node1, node2 and node3). I see, for example, two problems:

1. When node3 comes up, should it know the endpoint of node1 and node2? Or only node1? How do we decide on that?
2. Imagine it knows the endpoint of both node1 and node2, but node1 and node2 don't know the endpoint of node3. If I understand correctly, node1 and node2 can't communicate with node3 unless node3 tries to communicate with them. That means that pods in node1 and node2 won't be able to contact node3 pods, right?
client can communicate with the server but the server can't communicate with client until the client contacts first, right?
yes
There is no conflict between server-client and pod-pod. The pod-pod network is a tunnel created over the server-client connection. Pods can communicate only after the client has connected to the server and the tunnel has been created.
Right, but the server needs to wait for the client to contact it. What if the client never contacts the server?
WireGuard on the client contacts the server when it starts up. If the client never contacts the server, that means the node is not ready.
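To make the behaviour discussed above concrete, here is a minimal Python sketch of how a server-side peer entry could behave when only the client's public key is known. It is a toy model, not WireGuard or flannel code: the `Peer` class, `on_authenticated_packet` and `node_ready` are hypothetical names, and treating "an endpoint has been learned" as "the node is ready" is an assumption taken from the reply above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


@dataclass
class Peer:
    """One entry in a hypothetical peer table, identified by public key."""
    public_key: str
    endpoint: Optional[Endpoint] = None  # None until configured or learned

    def can_send(self) -> bool:
        # Without an endpoint there is no address to send packets to.
        return self.endpoint is not None

    def on_authenticated_packet(self, source: Endpoint) -> None:
        # Model of endpoint learning: remember where the peer's last valid
        # packet came from. For a NAT'd peer this is the NAT's public
        # address and mapped port, not the peer's internal address.
        self.endpoint = source


def node_ready(peer: Peer) -> bool:
    # Assumption for this sketch: a node counts as ready once the server
    # has learned an endpoint for it.
    return peer.can_send()
```

In this model the server cannot initiate anything toward the client until the client's first packet arrives, which matches the "the client must contact first" behaviour confirmed earlier in the thread.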
Imagine we have 2 nodes. 1 node is the k8s control-plane and 1 node is the k8s agent and it is behind a NAT (let's call it node1). In this case, I can see your suggestion working.
However, what happens if we add a new k8s agent node behind a NAT (let's call it node2)? We need to know the endpoint of node1 or node2 to create that tunnel between both nodes, right?
I'm not sure whether the WireGuard master will synchronize all endpoint information to the other nodes.
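The two-NAT'd-agents question can be phrased as a small check: a tunnel between two nodes can only be started if at least one side has an endpoint the other side can dial. The Python sketch below is illustrative only. The function names and the `nodes` mapping are hypothetical, and it assumes that nobody (for example the master) relays learned endpoints to the other nodes, which is exactly the open question here.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple


def tunnel_possible(endpoint_a: Optional[str], endpoint_b: Optional[str]) -> bool:
    # At least one side must have an address the other side can dial.
    return endpoint_a is not None or endpoint_b is not None


def unreachable_pairs(nodes: Dict[str, Optional[str]]) -> List[Tuple[str, str]]:
    """Return node pairs where neither side has a dialable endpoint.

    `nodes` maps a node name to its publicly dialable endpoint, or None
    if the node is behind a NAT and no endpoint is known for it.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if not tunnel_possible(nodes[a], nodes[b])
    ]
```

With a control plane that has a known endpoint and two agents whose endpoints are unknown, the only pair reported is the two agents, which is the node1/node2 case raised above.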
Cluster Configuration:
server:
node:
node-x86 (NAT'd and doesn't know its IP address): EXTERNAL-IP: xx.xx.xx.yy, INTERNAL-IP: 192.168.36.22
node-arm: EXTERNAL-IP: xx.xx.xx.zz, INTERNAL-IP: 10.0.1.217
node-x86 configuration
node-arm configuration
master wg show
peer: hldi2xxx
  endpoint: xx.xx.xx.zz:51820
  allowed ips: 10.42.2.0/24
  latest handshake: 28 seconds ago
  transfer: 1.52 KiB received, 3.16 KiB sent
  persistent keepalive: every 25 seconds

peer: Ww7xx
  endpoint: xx.xx.xx.xx:51820
  allowed ips: 10.42.0.0/24
  transfer: 0 B received, 30.06 KiB sent
  persistent keepalive: every 25 seconds

interface: flannel-wg
  public key: hldi26xxxx
  private key: (hidden)
  listening port: 51820

peer: Ww7xxxx
  endpoint: xx.xx.xx.xx:51820
  allowed ips: 10.42.0.0/24
  latest handshake: 8 seconds ago
  transfer: 6.53 MiB received, 15.16 MiB sent
  persistent keepalive: every 25 seconds

peer: Ap//xxxx
  endpoint: xx.xx.xx.yy:8598  # that's right
  allowed ips: 10.42.5.0/24
  latest handshake: 1 minute, 12 seconds ago
  transfer: 2.86 KiB received, 2.04 KiB sent
  persistent keepalive: every 25 seconds
wg show flannel-wg
interface: flannel-wg
  public key: Wxxxx
  private key: (hidden)
  listening port: 51820

peer: hldi2xxxx
  endpoint: xx.xx.xx.zz:51820
  allowed ips: 10.42.2.0/24
  latest handshake: 25 seconds ago
  transfer: 11.72 MiB received, 6.53 MiB sent
  persistent keepalive: every 25 seconds

peer: Ap//Dxxx
  endpoint: xx.xx.xx.yy:8598  # that's right
  allowed ips: 10.42.5.0/24
  transfer: 0 B received, 33.39 KiB sent
  persistent keepalive: every 25 seconds
wg show flannel-wg
interface: flannel-wg
  public key: Wxxxx
  private key: (hidden)
  listening port: 51820

peer: hldi2xxxx
  endpoint: xx.xx.xx.zz:51820
  allowed ips: 10.42.2.0/24
  latest handshake: 25 seconds ago
  transfer: 11.72 MiB received, 6.53 MiB sent
  persistent keepalive: every 25 seconds

peer: Ap//Dxxx
  endpoint: 192.168.36.22:51820  # It's wrong
  allowed ips: 10.42.5.0/24
  transfer: 0 B received, 33.39 KiB sent
  persistent keepalive: every 25 seconds
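The "wrong" case above is a peer whose endpoint is the node's internal address (192.168.36.22) rather than the address its NAT presents. A rough way to spot this in `wg show` style output is to flag endpoints that fall in the RFC 1918 private ranges. The Python sketch below is a hedged illustration, not part of flannel or WireGuard: the helper names are hypothetical, it handles IPv4 `host:port` endpoints only, and it expects real addresses rather than the redacted `xx.xx.xx.yy` placeholders used in this thread. A private endpoint is not always an error (nodes on the same LAN legitimately use one), so treat the result as a hint.

```python
import ipaddress
from typing import Dict, List

# RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def parse_endpoints(wg_output: str) -> Dict[str, str]:
    """Map each peer's public key to its endpoint from `wg show` style text."""
    endpoints: Dict[str, str] = {}
    current = None
    for raw in wg_output.splitlines():
        line = raw.strip()
        if line.startswith("peer:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("endpoint:") and current is not None:
            endpoints[current] = line.split(":", 1)[1].strip()
    return endpoints


def is_rfc1918(endpoint: str) -> bool:
    # Assumes an IPv4 "host:port" endpoint; split off the port first.
    host, _, _port = endpoint.rpartition(":")
    addr = ipaddress.ip_address(host)
    return any(addr in net for net in RFC1918)


def suspicious_peers(wg_output: str) -> List[str]:
    """Return peers whose endpoint is a private (RFC 1918) address."""
    return [
        peer
        for peer, endpoint in parse_endpoints(wg_output).items()
        if is_rfc1918(endpoint)
    ]
```

Run against output like the last listing (with real addresses substituted), this would report only the peer whose endpoint is 192.168.36.22:51820.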