bryopsida / wireguard-chart

A helm chart for wireguard

Novice question: how to deploy the chart in k8s, use the wg client on macOS to connect, and route only k8s-internal domain traffic #54

Open freshgeek opened 1 month ago

freshgeek commented 1 month ago

Thank you @bryopsida for this chart, which lets me deploy a wg service with one click. I look forward to your reply.

As in the title: how to deploy the chart in k8s, use the wg client on macOS to connect, and route only k8s-internal domain traffic?

This is my values.yaml:

wireguard:
  clients:
    - FriendlyName: cc
      PublicKey: tlJXcOQXVigzmEmyMEna3TNLqXAwFeEFD10P6NvYFRE=
      AllowedIPs: 192.168.10.1/24
  serverAddress: 192.168.0.1/16
  serverCidr: 192.168.0.0/16
replicaCount: 2
autoscaling:
  minReplicas: 2
  maxReplicas: 10
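
For reference, a sketch of applying these values (the repo alias wireguard and the release name wg are assumptions; the pod names later in this thread suggest the release is wg — see the project README for the exact repo URL):

# add the chart repo as described in the project README, then:
helm install wg wireguard/wireguard -f values.yaml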

The k8s cluster (Alibaba Cloud ACK) CIDR is 192.168.0.0/16. serverAddress is a randomly chosen subnet address, 192.168.0.1/16. The client PublicKey was generated by the macOS client, and AllowedIPs is a randomly chosen subnet, 192.168.10.1/24.

My client config (downloaded from the macOS App Store) imitates your example: https://github.com/bryopsida/wireguard-chart?tab=readme-ov-file#example-tunnel-configurations

[Interface]
PrivateKey = mGmJINaT0IYddr6X+0Qfyyr1OQCzouI/ReyaRTmmclc=
Address = 192.168.10.1/24
DNS = 192.168.0.10

[Peer]
PublicKey = olf180XbonwpKVncLkCxLrbjazo878h5es2Tc3EQZn4=
AllowedIPs = 0.0.0.0/0
Endpoint = 47.237.109.24:51820

Interface: PrivateKey was generated on macOS. Address is the same as the AllowedIPs in the server config. DNS is the cluster IP of the k8s kube-dns service (ClusterIP type):

(screenshot: the kube-dns service, showing its cluster IP 192.168.0.10)

Peer: PublicKey was copied from the wg server pod's console output. AllowedIPs temporarily allows everything (actually, I only want internal network services to be reached through wg, while the rest goes through normal traffic). Endpoint is the load balancer address generated for the wg service.

This k8s cluster is a demo cluster, so I haven't masked anything.

Expected: just like in your example, dig mysql.default.svc.cluster.local should resolve.

Actually, dig returns:


% dig mysql.default.svc.cluster.local

; <<>> DiG 9.10.6 <<>> mysql.default.svc.cluster.local
;; global options: +cmd
;; connection timed out; no servers could be reached
bryopsida commented 1 month ago

@freshgeek at first glance your configuration looks valid; there are a few things at the cloud provider level that can impact functionality though.

1) Your cloud provider must support UDP load balancing to/from your load balancer IP: 47.237.109.24
2) The image/baseline used on the nodes where the wireguard pod runs needs to have the wireguard linux kernel module available
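
One way to check the module (a sketch; run from a shell on the node, or inside the wireguard pod since it shares the node's kernel):

# lists the module if it is loaded
lsmod | grep wireguard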

To make troubleshooting easier while setting things up, I'd recommend you scale down to one pod:

replicaCount: 1
autoscaling:
  enabled: false

This way you can enter a shell on that one pod and run sudo wg show to see if your client has completed a handshake with the server.
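
A sketch of those steps (the deployment name wg-wireguard matches the pod names later in this thread; adjust the namespace to your install):

# open a shell in the single wireguard pod
kubectl exec -it deploy/wg-wireguard -- sh
# inside the pod: check whether the client peer has a recent handshake
sudo wg show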

What network CNI are you using in your cluster and do you have any cluster wide network policies that might be blocking traffic?

If you are using cilium cni, can you use Hubble to see if the traffic flows are reaching the wireguard pod?

freshgeek commented 1 month ago

Thanks for your reply @bryopsida, I will confirm immediately ~

  1. The cloud provider's web console shows UDP LB is supported, and tomorrow I will confirm it with their support personnel.
  2. I didn't quite understand this one, so I asked an AI; it suggested running lsmod inside the wg pod, and the output says this is OK:
    wg-wireguard-55474b6c66-899gt:~$ lsmod | grep wireguard
    wireguard              98304  0 
    libchacha20poly1305    16384  1 wireguard,[permanent]
    ip6_udp_tunnel         16384  1 wireguard
    udp_tunnel             28672  1 wireguard
    curve25519_x86_64      36864  1 wireguard
    libcurve25519_generic    49152  2 wireguard,curve25519_x86_64

And the CNI is the cluster's default component, Terway. I will attempt to use Hubble.

bryopsida commented 1 month ago

> I will attempt to use Hubble

I'm not familiar with terway, but as far as I know Hubble will only work with cilium. Maybe there's a similar component with terway that will confirm traffic is reaching the wireguard pod?

freshgeek commented 1 month ago

Yep, I ran wg show and the latest handshake was 23 seconds ago; I used the macOS client to connect. But while the client is running, my network is not available:

wg-wireguard-55474b6c66-899gt:~$ sudo wg show
interface: wg0
  public key: olf180XbonwpKVncLkCxLrbjazo878h5es2Tc3EQZn4=
  private key: (hidden)
  listening port: 51820

peer: tlJXcOQXVigzmEmyMEna3TNLqXAwFeEFD10P6NvYFRE=
  endpoint: 182.105.116.243:26876
  allowed ips: 192.168.10.0/24
  latest handshake: 23 seconds ago
  transfer: 270.73 KiB received, 473.32 KiB sent
bryopsida commented 1 month ago

When you run dig mysql.default.svc.cluster.local, do you see any logs in the kube-dns/coredns pods related to your query?

And if not, does modifying the dig command to use a specific dns server make any difference? dig mysql.default.svc.cluster.local @192.168.0.10

freshgeek commented 1 month ago

No related entries in the last 500 lines of logs across all kube-dns pods.

And adding @192.168.0.10 to the command also produces no logs.

bryopsida commented 1 month ago

Are there any network policies at the cluster level that could be blocking the connection from the wireguard pod to the kube-dns/coredns service?
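
A quick way to check (a sketch; Terway may also define its own policy CRDs that this won't list):

# list all NetworkPolicy objects across namespaces
kubectl get networkpolicies -A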

bryopsida commented 1 month ago

And do you have any warnings in the wireguard pod logs?

A clean startup should look like this.

sysctls net.ipv4.ip_forward = 1                                                                                                                                                                                                                             
sysctls net.ipv4.conf.all.forwarding = 1                                                                                                                                                                                                                    
wireguard [#] ip link add wg0 type wireguard                                                                                                                                                                                                                
wireguard [#] wg setconf wg0 /dev/fd/63                                                                                                                                                                                                                     
wireguard [#] ip -4 address add <VPN SERVER CIDR> dev wg0                                                                                                                                                                                                      
wireguard [#] ip link set mtu 1420 up dev wg0                                                                                                                                                                                                               
wireguard [#] wg set wg0 private-key /etc/wireguard/privatekey && iptables -t nat -A POSTROUTING -s <VPN CIDR> -o eth0 -j MASQUERADE                                                                                                                    
wireguard Public key 'REDACTED' 
freshgeek commented 1 month ago

> And do you have any warnings in the wireguard pod logs? A clean startup should look like this. […]

Compared with that, the log looks normal. This is my pod log:

[#] ip link add wg0 type wireguard
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 192.168.0.1/16 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] wg set wg0 private-key /etc/wireguard/privatekey && iptables -t nat -A POSTROUTING -s 192.168.0.0/16 -o eth0 -j MASQUERADE
Public key 'olf180XbonwpKVncLkCxLrbjazo878h5es2Tc3EQZn4='
freshgeek commented 1 month ago

> Are there any network policies at the cluster level that could be blocking the connection from the wireguard pod to the kube-dns/coredns service?

To keep it simple, I set network policies to disabled (disable_network_policy is true), which should allow everything.

bryopsida commented 1 month ago

It looks like hubble is compatible with terway.

https://www.alibabacloud.com/help/en/ack/ack-managed-and-ack-dedicated/user-guide/implement-network-observability-by-using-ack-terway-and-cilium-hubble

Based on the logs, and the wireguard peer handshake status from the pod, it looks like the connection from your laptop -> WG pod is working but the dns query isn't reaching the coredns/kubedns pod that can resolve your internal mysql service address.

Hubble will let you confirm that your traffic is getting to the wireguard pod and show where the dns queries are being sent to and any forward/drop verdicts on the flows that might be impacting dns resolution.

Some steps you can do without Hubble to confirm connectivity to the dns service and wg connectivity:

1) To test if the wireguard pod is allowed to connect to kube-dns you can run the following from a shell inside the wireguard pod.

(screenshot: nslookup queries run from a shell inside the wireguard pod)
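
A sketch of the same check (the mysql service name and the kube-dns address 192.168.0.10 are taken from this thread; substitute your own):

# query kube-dns directly, bypassing /etc/resolv.conf
nslookup kubernetes.default.svc.cluster.local 192.168.0.10
nslookup mysql.default.svc.cluster.local 192.168.0.10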

In my case I do not have a mysql service in the default namespace so it doesn't resolve but the kubernetes default service does.

2) Pick an http service in the cluster, make note of the service name and ip, and use curl with the --resolve option from your laptop to confirm you are able to connect to the service through wireguard. If this works, it isolates the issue to just connectivity to the kube dns service.
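
For example, assuming a service my-http-svc in the default namespace with cluster IP 192.168.55.95 on port 8080 (hypothetical names; substitute your own):

# map the service name to its cluster IP so no DNS lookup is needed
curl --resolve my-http-svc.default.svc.cluster.local:8080:192.168.55.95 \
  http://my-http-svc.default.svc.cluster.local:8080/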

bryopsida commented 1 month ago

When you have the visual filters set up to show everything in Hubble like this:

(screenshot: Hubble UI with filters showing all flows)

You should see your laptop's queries going to kube-dns like this:

(screenshot: DNS flows from the wireguard pod to kube-dns)
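
From the CLI, a roughly equivalent check (a sketch; assumes the hubble client can reach the relay, and the pod name prefix is an assumption):

# watch DNS flows and their verdicts live
hubble observe --protocol dns --follow
# look for dropped flows involving the wireguard pod
hubble observe --pod default/wg-wireguard --verdict DROPPED --follow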

Looking at your kubernetes cidr and wg cidr, it looks like you might have overlap between the WG VPN cidr and the kubernetes pod cidr.

You want your wireguard VPN CIDR and kubernetes CIDR to be different segments; the wireguard pod will masquerade the traffic, and this ensures you won't have any collisions with pod/service IPs.

So for example, on a fresh k3s + cilium cluster I have the following kubernetes cidrs:

  • 10.42.0.0/16 (pods)
  • 10.43.0.0/16 (services)

So for my wireguard cidr I use:

  • 172.32.32.0/24

This ensures wireguard clients are on a different network segment.

In the above example, to make sure all traffic goes through wireguard, you use this in the tunnel:

[Peer]
PublicKey = redacted
AllowedIPs = 0.0.0.0/0
Endpoint = redacted:51820

If you wanted only traffic for the wg cidr, service cidr, and pod cidr sent over the vpn, you would set the following:

[Peer]
PublicKey = redacted
AllowedIPs = 10.42.0.0/16,10.43.0.0/16,172.32.32.0/24
Endpoint = redacted:51820
freshgeek commented 1 month ago

Thanks for your reply, I will confirm ~

freshgeek commented 1 month ago

> Some steps you can do without Hubble to confirm connectivity to the dns service and wg connectivity […]

I tried nslookup from the wg pod for the mysql svc and the kubernetes svc; both time out:

wg-wireguard-5fb67d56d9-lq2j2:~$ nslookup mysql.default.svc.cluster.local
;; connection timed out; no servers could be reached

wg-wireguard-5fb67d56d9-lq2j2:~$ sudo ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10): 56 data bytes

It works from my other service pod:

root@u9skins-server-java-test-mdm1zd-55f687b5cc-rwzxb:/app# nslookup mysql.default.svc.cluster.local
Server:         192.168.0.10
Address:        192.168.0.10#53

Name:   mysql.default.svc.cluster.local
Address: 192.168.241.230

root@u9skins-server-java-test-mdm1zd-55f687b5cc-rwzxb:/app# ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10): 56 data bytes
92 bytes from kube-dns.kube-system.svc.cluster.local (192.168.0.10): Destination Port Unreachable
^C--- 192.168.0.10 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss

And I tried curl from my local MacBook client against that http service, like this: curl http://192.168.55.95:8080. There is no response; probably also a timeout.

freshgeek commented 1 month ago

> Looking at your kubernetes cidr and wg cidr it looks like you might have overlap between the WG VPN cidr and the kubernetes pod cidr. […]

On this step: according to their support, Alibaba Cloud ACK does not support Hubble.

bryopsida commented 1 month ago

Based on the screenshot you sent with your kube-dns server address, am I correct to assume your kubernetes cluster pod and service cidr is 192.168.0.0/16?

If that's the case, can you try using a different cidr for the wireguard network?

Perhaps something like this.

wireguard:
  clients:
    - FriendlyName: cc
      PublicKey: tlJXcOQXVigzmEmyMEna3TNLqXAwFeEFD10P6NvYFRE=
      AllowedIPs: 172.32.32.2/32
  serverAddress: 172.32.32.1/24
  serverCidr: 172.32.32.0/24

You would need to adjust the tunnel definition on your laptop to use the new client ip as well. I think the overlapping segments are causing routing problems.
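
A sketch of the adjusted tunnel on the laptop under those values (keys and endpoint redacted; AllowedIPs here routes only the cluster range and the new wg range, which gives the split tunnel asked about in the title):

[Interface]
PrivateKey = redacted
Address = 172.32.32.2/32
DNS = 192.168.0.10

[Peer]
PublicKey = redacted
AllowedIPs = 192.168.0.0/16,172.32.32.0/24
Endpoint = redacted:51820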

freshgeek commented 1 month ago

> Perhaps something like this.

Thanks for your reply, I will try it later.

And I got it working using another repo; when I have free time later, I will compare the differences between the two for reference.