faisalbasha19 commented 1 year ago


What steps did you take and what happened:

I have an issue with client pods connecting to pod-gateway. I have set up pod-gateway properly, but client_init.sh does not allocate an IP address for the vxlan0 interface automatically; it says it failed to get a lease.

My issue here is:

K8S_DNS_IP="$(cut -d ' ' -f 1 <<< "$K8S_DNS_IPS")"
GATEWAY_IP="$(dig +short "$GATEWAY_NAME" "@${K8S_DNS_IP}")"

The statements above will not work if ip route del 0/0 || /bin/true has been executed before them. When a pod is created, I don't have 8.8.8.8 as the nameserver. I have this:

ip route
default via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link

cat /etc/resolv.conf
nameserver 10.100.0.10
search namespace.svc.cluster.local ....
options ndots:5

So if I delete the default gateway after those statements, here is what I get for these variables:

K8S_DNS_IP=10.100.0.10
GATEWAY_IP=10.100.37.21

Now, for the vxlan0 setup, the second statement is bridge fdb append to 00:00:00:00:00:00 dst "$GATEWAY_IP" dev vxlan0

So for the above, $GATEWAY_IP = 10.100.37.21. Is this what is expected for the GATEWAY_IP address?

Also, dhclient -v -cf /etc/dhclient.conf vxlan0

does not give me any IP address for the vxlan0 interface,

so the final ping to 172.16.0.1 won't work.

NAT_ENTRY is also empty for me, so no routes to 172.16.0.1 are added as the default gateway. How should this work?
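On the empty NAT_ENTRY: client_init.sh fills it with grep "$(hostname)" /config/nat.conf || true, so an empty value only means that no line in nat.conf matched the pod's hostname. A minimal sketch of that lookup follows. The helper name lookup_nat_entry and the sample file contents are hypothetical and only illustrate the grep behaviour, not the real nat.conf format.

```shell
#!/bin/bash
# Hypothetical helper mirroring the lookup in client_init.sh:
# print the nat.conf line(s) that mention the given hostname, or nothing.
lookup_nat_entry() {
  local host="$1" nat_conf="$2"
  # "|| true" keeps a non-matching grep from failing the script
  grep "$host" "$nat_conf" || true
}

# Demo with a made-up file; the line format here is an assumption.
demo_conf="$(mktemp)"
echo "terminal 10" > "$demo_conf"
echo "match:    '$(lookup_nat_entry terminal "$demo_conf")'"
echo "no match: '$(lookup_nat_entry other-pod "$demo_conf")'"
rm -f "$demo_conf"
```

If the second case is what you see, the pod's hostname is simply not listed in nat.conf.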

dannypv05261 commented 1 year ago

I got the same error on this line when my container runs the client_init.sh script in the init container of a pod that needs to access the Internet via the gateway. I hope it can be fixed. Thanks.

Here is the log

VXLAN_ID="42"
# VXLAN need an /24 IP range not conflicting with K8S and local IP ranges
VXLAN_IP_NETWORK="172.16.0"
# Keep a range of IPs for static assignment in nat.conf
VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20
# If using a VPN, interface name created by it
VPN_INTERFACE=tun0
# Prevent non VPN traffic to leave the gateway
VPN_BLOCK_OTHER_TRAFFIC=true
# If VPN_BLOCK_OTHER_TRAFFIC is true, allow VPN traffic over this port
VPN_TRAFFIC_PORT=443
# Traffic to these IPs will be send through the K8S gateway
VPN_LOCAL_CIDRS="10.0.0.0/8 192.168.0.0/16"
# DNS queries to these domains will be resolved by K8S DNS instead of
# the default (typcally the VPN client changes it)
DNS_LOCAL_CIDRS="local"
# dnsmasq monitors directories. /etc/resolv.conf in a container is in another
# file system so it does not work. To circumvent this a copy is made using
# inotifyd
RESOLV_CONF_COPY=/etc/resolv_copy.conf
# ICMP heartbeats are used to ensure the pod-gateway is connectable from the clients.
# The following value can be used to to provide more stability in an unreliable network connection.
CONNECTION_RETRY_COUNT=1
# If you use nftables for iptables you need to set this to yes
IPTABLES_NFT=no
+ . /default_config/settings.sh
++ GATEWAY_NAME=pod-gateway.default.svc.cluster.local
++ K8S_DNS_IPS=10.43.0.10
++ NOT_ROUTED_TO_GATEWAY_CIDRS=
++ VXLAN_ID=42
++ VXLAN_IP_NETWORK=172.16.0
++ VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20
++ VPN_INTERFACE=tun0
++ VPN_BLOCK_OTHER_TRAFFIC=true
++ VPN_TRAFFIC_PORT=443
++ VPN_LOCAL_CIDRS='10.0.0.0/8 192.168.0.0/16'
++ DNS_LOCAL_CIDRS=local
++ RESOLV_CONF_COPY=/etc/resolv_copy.conf
++ CONNECTION_RETRY_COUNT=1
++ IPTABLES_NFT=no
+ cat /config/settings.sh
#!/bin/sh
# Generated by pod-gateway
DNS_LOCAL_CIDRS="local"
NOT_ROUTED_TO_GATEWAY_CIDRS=""
VPN_BLOCK_OTHER_TRAFFIC="false"
VPN_INTERFACE="tun0"
VPN_LOCAL_CIDRS="10.0.0.0/8 192.168.0.0/16"
VPN_TRAFFIC_PORT="1194"
VXLAN_GATEWAY_FIRST_DYNAMIC_IP="20"
VXLAN_ID="42"
VXLAN_IP_NETWORK="172.16.0"
+ . /config/settings.sh
++ DNS_LOCAL_CIDRS=local
++ NOT_ROUTED_TO_GATEWAY_CIDRS=
++ VPN_BLOCK_OTHER_TRAFFIC=false
++ VPN_INTERFACE=tun0
++ VPN_LOCAL_CIDRS='10.0.0.0/8 192.168.0.0/16'
++ VPN_TRAFFIC_PORT=1194
++ VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20
++ VXLAN_ID=42
++ VXLAN_IP_NETWORK=172.16.0
+ ip addr
+ grep -q vxlan0
++ /sbin/ip route
++ awk '/default/ { print $3 }'
+ K8S_GW_IP=
+ echo 'Deleting existing default GWs'
Deleting existing default GWs
+ ip route del 0/0
RTNETLINK answers: No such process
+ /bin/true
+ ping -c 1 -W 1000 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: Network unreachable
+ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0@if83: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.42.1.214/24 brd 10.42.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f804:83ff:fee4:98c5/64 scope link 
       valid_lft forever preferred_lft forever
+ ip route
10.42.0.0/16 via 10.42.1.1 dev eth0 
10.42.1.0/24 dev eth0 proto kernel scope link src 10.42.1.214 
++ cut -d ' ' -f 1
+ K8S_DNS_IP=10.43.0.10
++ dig +short pod-gateway.default.svc.cluster.local @10.43.0.10
+ GATEWAY_IP=';; connection timed out; no servers could be reached'

My temporary fix is to move these two lines in front of ip route del 0/0 || /bin/true:

K8S_DNS_IP="$(cut -d ' ' -f 1 <<< "$K8S_DNS_IPS")"
GATEWAY_IP="$(dig +short "$GATEWAY_NAME" "@${K8S_DNS_IP}")"

K8S_DNS_IP="$(cut -d ' ' -f 1 <<< "$K8S_DNS_IPS")"
GATEWAY_IP="$(dig +short "$GATEWAY_NAME" "@${K8S_DNS_IP}")"

# Delete default GW to prevent outgoing traffic to leave this docker
echo "Deleting existing default GWs"
ip route del 0/0 || /bin/true

# After this point nothing should be reachable -> check
if ping -c 1 -W 1000 8.8.8.8; then
  echo "WE SHOULD NOT BE ABLE TO PING -> EXIT"
  exit 255
fi

# For debugging reasons print some info
ip addr
ip route

# Derived settings
NAT_ENTRY="$(grep "$(hostname)" /config/nat.conf || true)"
VXLAN_GATEWAY_IP="${VXLAN_IP_NETWORK}.1"
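As the log above shows, when the lookup fails dig +short still prints its error text to stdout, so GATEWAY_IP ends up holding ';; connection timed out; no servers could be reached' and the script carries on with it. A minimal sketch of a guard, assuming bash: the helper names is_ipv4 and resolve_gateway_ip are hypothetical and not part of client_init.sh.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the argument is a dotted-quad IPv4 address.
is_ipv4() {
  local candidate="$1"
  local octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'
  [[ "$candidate" =~ ^${octet}\.${octet}\.${octet}\.${octet}$ ]]
}

# Hypothetical wrapper: resolve a name against a given DNS server and
# fail loudly if the answer is not an IPv4 address (e.g. dig error text).
resolve_gateway_ip() {
  local name="$1" dns_ip="$2" answer
  answer="$(dig +short "$name" "@${dns_ip}" | head -n 1)"
  if ! is_ipv4 "$answer"; then
    echo "Could not resolve $name via $dns_ip: $answer" >&2
    return 1
  fi
  echo "$answer"
}
```

Calling something like GATEWAY_IP="$(resolve_gateway_ip "$GATEWAY_NAME" "$K8S_DNS_IP")" || exit 1 before the default route is deleted would stop the init container at the real cause instead of at the later dhclient step.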

faisalbasha19 commented 1 year ago

Hi,

Even after swapping the lines, are you still unable to resolve the GATEWAY_IP address using the Kubernetes DNS? If you comment out the ip route del 0/0 line completely, you will notice that it resolves, but it may not give you the IP address.

dannypv05261 commented 1 year ago

Sorry, I meant that I faced the same issue and that I can resolve the GATEWAY_IP address after swapping the lines.

faisalbasha19 commented 1 year ago

What was the IP address in GATEWAY_IP after resolving? Was it a Kubernetes service address or something else? Technically it should only give you the IP of the pod-gateway.default.svc.cluster.local service, but with the default config ClusterIP is set to None, which means you will not get an IP address for this service. So I am curious which IP address you got.

dannypv05261 commented 1 year ago

From my understanding, if ClusterIP is None, it is called a headless service. When we resolve a headless service (e.g. with nslookup or dig), it returns the IP address of each pod (Endpoint) instead of a single service cluster IP address. In the default config there is only 1 replica of the pod-gateway, so it returns the IP address of the one and only Endpoint, 10.42.1.x.
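One consequence, sketched below: with more than one gateway replica, dig +short on a headless service would print one address per line, so a variable filled from it would hold several lines. The helper first_address is hypothetical and the two addresses are made-up examples; it only shows how a single address could be picked from such an answer.

```shell
#!/bin/bash
# Hypothetical helper: keep only the first line of a multi-line DNS answer.
first_address() {
  head -n 1
}

# Simulated answer for a headless service with two endpoints (made-up values).
printf '10.42.1.5\n10.42.1.6\n' | first_address
```

With the default single replica this makes no difference, since the answer already has one line.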

faisalbasha19 commented 1 year ago

But don't you ever get a network unreachable error? I always get this in my logs:

I have created the pod-gateway in the default namespace and a terminal pod in a namespace called vpn, where the webhook injects two init containers, and this is where I get stuck. This happens even after swapping the lines.

kubectl logs terminal-574db4f9c6-2vhxd -n vpn -c gateway-init -f
+ cat /default_config/settings.sh
#!/bin/bash

# hostname of the gateway - it must accept vxlan and DHCP traffic
# clients get it as env variable
GATEWAY_NAME="$gateway"
# K8S DNS IP address
# clients get it as env variable
K8S_DNS_IPS="$K8S_DNS_ips"
# Blank  sepated IPs not sent to the POD gateway but to the default K8S
# This is needed, for example, in case your CNI does
# not add a non-default rule for the K8S addresses (Flannel does)
NOT_ROUTED_TO_GATEWAY_CIDRS=""

# Vxlan ID to use
VXLAN_ID="42"
# VXLAN need an /24 IP range not conflicting with K8S and local IP ranges
VXLAN_IP_NETWORK="172.16.0"
# Keep a range of IPs for static assignment in nat.conf
VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20

# If using a VPN, interface name created by it
VPN_INTERFACE=tun0
# Prevent non VPN traffic to leave the gateway
VPN_BLOCK_OTHER_TRAFFIC=true
# If VPN_BLOCK_OTHER_TRAFFIC is true, allow VPN traffic over this port
VPN_TRAFFIC_PORT=1194
# Traffic to these IPs will be send through the K8S gateway
VPN_LOCAL_CIDRS="192.168.0.0/16 10.0.0.0/8"

# DNS queries to these domains will be resolved by K8S DNS instead of
# the default (typcally the VPN client changes it)
DNS_LOCAL_CIDRS="local"

# dnsmasq monitors directories. /etc/resolv.conf in a container is in another
# file system so it does not work. To circumvent this a copy is made using
# inotifyd
RESOLV_CONF_COPY=/etc/resolv_copy.conf

# ICMP heartbeats are used to ensure the pod-gateway is connectable from the clients.
# The following value can be used to to provide more stability in an unreliable network connection.
CONNECTION_RETRY_COUNT=1

# If you use nftables for iptables you need to set this to yes
IPTABLES_NFT=no
+ . /default_config/settings.sh
++ GATEWAY_NAME=pod-gateway.default.svc.cluster.local
++ K8S_DNS_IPS=10.100.0.10
++ NOT_ROUTED_TO_GATEWAY_CIDRS=
++ VXLAN_ID=42
++ VXLAN_IP_NETWORK=172.16.0
++ VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20
++ VPN_INTERFACE=tun0
++ VPN_BLOCK_OTHER_TRAFFIC=true
++ VPN_TRAFFIC_PORT=1194
++ VPN_LOCAL_CIDRS='192.168.0.0/16 10.0.0.0/8'
++ DNS_LOCAL_CIDRS=local
++ RESOLV_CONF_COPY=/etc/resolv_copy.conf
++ CONNECTION_RETRY_COUNT=1
++ IPTABLES_NFT=no
+ cat /config/settings.sh
#!/bin/sh
# Generated by pod-gateway
DNS_LOCAL_CIDRS="local"
NOT_ROUTED_TO_GATEWAY_CIDRS=""
VPN_BLOCK_OTHER_TRAFFIC="true"
VPN_INTERFACE="tun0"
VPN_LOCAL_CIDRS="10.0.0.0/8 192.168.0.0/16"
VPN_TRAFFIC_PORT="1987"
VXLAN_GATEWAY_FIRST_DYNAMIC_IP="20"
VXLAN_ID="42"
VXLAN_IP_NETWORK="172.16.0"
+ . /config/settings.sh
++ DNS_LOCAL_CIDRS=local
++ NOT_ROUTED_TO_GATEWAY_CIDRS=
++ VPN_BLOCK_OTHER_TRAFFIC=true
++ VPN_INTERFACE=tun0
++ VPN_LOCAL_CIDRS='10.0.0.0/8 192.168.0.0/16'
++ VPN_TRAFFIC_PORT=1987
++ VXLAN_GATEWAY_FIRST_DYNAMIC_IP=20
++ VXLAN_ID=42
++ VXLAN_IP_NETWORK=172.16.0
+ grep -q vxlan0
+ ip addr
++ /sbin/ip route
++ awk '/default/ { print $3 }'
+ K8S_GW_IP=
+ ip route
+ ip a
+ K8S_DNS_IP=10.100.0.10
++ dig +short pod-gateway.default.svc.cluster.local @10.100.0.10
10.100.161.120 via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if124782: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
    link/ether 8e:76:fb:6e:eb:99 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.76.99/32 scope global eth0
       valid_lft forever preferred_lft forever
+ GATEWAY_IP=';; UDP setup with 10.100.0.10#53(10.100.0.10) for pod-gateway.default.svc.cluster.local failed: network unreachable.
;; UDP setup with 10.100.0.10#53(10.100.0.10) for pod-gateway.default.svc.cluster.local failed: network unreachable.
;; UDP setup with 10.100.0.10#53(10.100.0.10) for pod-gateway.default.svc.cluster.local failed: network unreachable.'
faisalbasha@vmubuntu:~/k8s-es-tests/vpn-pods/testvpn-pod$ kubectl get pods -n vpn
NAME                        READY   STATUS                  RESTARTS      AGE
terminal-574db4f9c6-2vhxd   0/2     Init:CrashLoopBackOff   3 (36s ago)   94s

faisalbasha19 commented 1 year ago

Hey, do you have any suggestions for the above issue?

mglants commented 1 year ago

NOT_ROUTED_TO_GATEWAY_CIDRS: 10.244.0.0/16 10.96.0.0/24

Put that in your settings and adjust the ranges according to your CNI.
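For context, the settings comment describes NOT_ROUTED_TO_GATEWAY_CIDRS as a blank-separated list of ranges that go to the default K8S gateway instead of the pod gateway. A dry-run sketch of what that could translate to, assuming one route per CIDR via the original gateway: the function print_bypass_routes is hypothetical, the gateway address 10.42.1.1 is only an example value, and the commands are printed rather than executed.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print one "ip route add" command per CIDR
# in a blank-separated list, all pointing at the original K8S gateway.
print_bypass_routes() {
  local k8s_gw_ip="$1" cidrs="$2" cidr
  # Unquoted on purpose so the list is split on blanks
  for cidr in $cidrs; do
    echo "ip route add $cidr via $k8s_gw_ip"
  done
}

# Example values only; use the pod and service CIDRs of your own cluster.
print_bypass_routes "10.42.1.1" "10.244.0.0/16 10.96.0.0/24"
```

This is also why the ranges must match your CNI: a route for the wrong range leaves the cluster DNS address unreachable once the default route is gone.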

angelnu / pod-gateway

Issue with dhclient in client_init.sh script #15
