Closed CroutonDigital closed 1 year ago
When I send a request to the service IP with curl http://10.43.26.206:9090, the SYN is never answered:
08:09:43.557956 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [SEW], cksum 0xc2fe (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418076138 ecr 0,sackOK,eol], length 0
08:09:44.560509 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbfd5 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418077139 ecr 0,sackOK,eol], length 0
08:09:45.561348 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbbec (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418078140 ecr 0,sackOK,eol], length 0
08:09:46.561143 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb803 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418079141 ecr 0,sackOK,eol], length 0
08:09:47.563019 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb41a (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418080142 ecr 0,sackOK,eol], length 0
08:09:48.562645 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb031 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418081143 ecr 0,sackOK,eol], length 0
When I send a request to the pod IP, everything works: curl http://10.42.7.207:9090
08:11:27.416767 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [SEW], cksum 0x91e9 (correct), seq 4180628869, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 2962259299 ecr 0,sackOK,eol], length 0
08:11:27.416865 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
10.42.7.207.9090 > 10.42.7.42.57666: Flags [S.E], cksum 0x237b (incorrect -> 0x7ddc), seq 4125798800, ack 4180628870, win 64704, options [mss 1360,sackOK,TS val 271347987 ecr 2962259299,nop,wscale 7], length 0
08:11:27.472540 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0xa104 (correct), ack 1, win 2058, options [nop,nop,TS val 2962259354 ecr 271347987], length 0
08:11:27.567033 IP (tos 0x2,ECT(0), ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 132)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [P.], cksum 0xe117 (correct), seq 1:81, ack 1, win 2058, options [nop,nop,TS val 2962259354 ecr 271347987], length 80
08:11:27.567096 IP (tos 0x0, ttl 63, id 8131, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.207.9090 > 10.42.7.42.57666: Flags [.], cksum 0x2373 (incorrect -> 0xa62e), ack 81, win 505, options [nop,nop,TS val 271348138 ecr 2962259354], length 0
08:11:27.570339 IP (tos 0x2,ECT(0), ttl 63, id 8132, offset 0, flags [DF], proto TCP (6), length 362)
10.42.7.207.9090 > 10.42.7.42.57666: Flags [P.], cksum 0x24a9 (incorrect -> 0x4738), seq 1:311, ack 81, win 505, options [nop,nop,TS val 271348141 ecr 2962259354], length 310
08:11:27.624034 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0x9e50 (correct), ack 311, win 2053, options [nop,nop,TS val 2962259507 ecr 271348141], length 0
08:11:27.719946 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [F.], cksum 0x9e4f (correct), seq 81, ack 311, win 2053, options [nop,nop,TS val 2962259507 ecr 271348141], length 0
08:11:27.720080 IP (tos 0x0, ttl 63, id 8133, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.207.9090 > 10.42.7.42.57666: Flags [F.], cksum 0x2373 (incorrect -> 0xa3c4), seq 311, ack 82, win 505, options [nop,nop,TS val 271348291 ecr 2962259507], length 0
08:11:27.776932 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0x9d20 (correct), ack 312, win 2053, options [nop,nop,TS val 2962259659 ecr 271348291], length 0
When I connect from inside the OpenVPN pod with curl http://10.43.26.206:9090, everything works.
The first screenshot shows the request to the pod, the second the request to the Kubernetes service.
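The service capture above shows the same SYN (identical sequence number) resent roughly once per second with no SYN-ACK, while the pod capture completes the handshake. A minimal sketch of how that pattern could be detected automatically in `tcpdump -v` text output follows. The regular expression, the function name `unanswered_syns`, and the abbreviated sample lines are illustrative assumptions, not part of any tool used in this thread.

```python
import re
from collections import Counter

# Matches the address line of a `tcpdump -v` TCP record, e.g.
# "10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum ..., seq 1239973897, ..."
RECORD = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+\.\d+): "
    r"Flags \[(?P<flags>[^\]]+)\].*?seq (?P<seq>\d+)"
)

def unanswered_syns(lines):
    """Return {(src, dst, seq): count} for SYNs that never got a SYN-ACK."""
    syns = Counter()
    answered = set()
    for line in lines:
        m = RECORD.search(line)
        if not m or "S" not in m["flags"]:
            continue  # not a TCP record with the SYN flag set
        if "." in m["flags"]:
            # SYN-ACK travels in the reverse direction of the original SYN.
            answered.add((m["dst"], m["src"]))
        else:
            syns[(m["src"], m["dst"], m["seq"])] += 1
    return {k: n for k, n in syns.items() if (k[0], k[1]) not in answered}

# Abbreviated lines taken from the two captures above (TCP options trimmed).
SAMPLE = """\
10.42.7.42.57615 > 10.43.26.206.9090: Flags [SEW], cksum 0xc2fe (correct), seq 1239973897, win 65535
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbfd5 (correct), seq 1239973897, win 65535
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbbec (correct), seq 1239973897, win 65535
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb803 (correct), seq 1239973897, win 65535
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb41a (correct), seq 1239973897, win 65535
10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb031 (correct), seq 1239973897, win 65535
10.42.7.42.57666 > 10.42.7.207.9090: Flags [SEW], cksum 0x91e9 (correct), seq 4180628869, win 65535
10.42.7.207.9090 > 10.42.7.42.57666: Flags [S.E], cksum 0x237b (incorrect -> 0x7ddc), seq 4125798800, ack 4180628870, win 64704
""".splitlines()

RESULT = unanswered_syns(SAMPLE)
for (src, dst, seq), count in RESULT.items():
    print(f"{src} -> {dst} seq {seq}: {count} SYNs, no SYN-ACK")
```

Only the flow towards the service IP is reported, which matches the symptom described: the packet leaves the OVPN pod but nothing ever answers on behalf of the cluster IP.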
Hey @CroutonDigital, I was not able to get a full picture from the information you provided and maybe I went the wrong path trying to follow your description. Can you try to be as precise as possible about the scenarios, especially from where you try to reach what?
Here is what I understood so far, but please read it carefully and correct/add missing information:
OVPN Pod: 10.42.7.42
Node (OVPN): 10.1.0.101
Web-App Pod: 10.42.7.207
Service (Web-App): 10.43.26.206
"when I try connect inside OpenVPN pod, curl http://10.43.26.206:9090/ all ok"
OVPN Pod > Service (Web-App) > Web-App Pod: OK
"OpenVPN client can connect to pod use pod ip and port [...]"
OVPN Client > OVPN Pod > Web-App Pod: OK
"[...] but if use service ip and service port connection timeout"
OVPN Client > OVPN Pod > Service (Web-App) > Web-App Pod: Fail
Please answer the following points:
Node (OVPN): ?
Please provide the full command you used for the trace.
Route path:
Not working: OVPN CLIENT (OS X with VPN IP in the 10.8.0.0/24 range) > LOAD BALANCER TCP 1194 > k8s > OVPN POD (10.42.7.42) > GRPC BACKEND SERVICE (10.43.26.206:9090) > GRPC BACKEND POD (10.42.7.207:9090)
Working: OVPN CLIENT (OS X with VPN IP in the 10.8.0.0/24 range) > LOAD BALANCER TCP 1194 > k8s > OVPN POD (10.42.7.42) > GRPC BACKEND POD (10.42.7.207:9090)
OVPN Client: 10.8.0.0/24 range (masqueraded behind the OVPN pod IP 10.42.7.42 by an iptables SNAT rule)
OVPN Pod: 10.42.7.42
Node (OVPN): 10.1.0.101
Web-App Pod: 10.42.7.207
Service (Web-App): 10.43.26.206
vpn.pcap.zip: the trace was taken on the OVPN pod.
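entrypoint.sh inside the OVPN pod:
#!/bin/bash
set -e
# Create the TUN device node that OpenVPN needs (character device, major 10, minor 200).
mkdir -p /dev/net
mknod /dev/net/tun c 10 200
chmod 600 /dev/net/tun
# Source-NAT VPN client traffic leaving via eth0 so it appears to come from the pod IP.
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
/bin/sleep 2
openvpn --config /opt/openvpn/server.conf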
The same OVPN pod is used on our current GKE cluster and works fine there.
Service definition:
resource "kubernetes_service" "binance_test_futures_client" {
  metadata {
    name      = "binance-test-futures-client"
    namespace = local.namespace_name
  }
  spec {
    port {
      port        = 9090
      target_port = 9090
      protocol    = "TCP"
    }
    selector = {
      app = kubernetes_daemonset.binance_test_futures_client.metadata[0].labels.app
    }
  }
}
Thx @CroutonDigital!
I can't see that the service LB is even trying to act on the masqueraded packet destined for the cluster IP. Can you please try again with the following Cilium configuration?
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450
This will skip the socket LB for services when inside a pod namespace, in favor of the service LB at the pod interface (tc load balancer).
I changed the Cilium values in kube.tf:
cilium_values = <<EOT
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450
EOT
Same issue. I also have an additional VM attached to the network, and it has the same problem communicating with the service.
I took a dump from the OVPN pod: dump.pcap.zip
OVPN Client: utun3 inet 10.8.0.6 --> 10.8.0.5
OVPN Pod: eth0 10.42.3.107 | tun0: 10.8.0.1
Node (OVPN): 10.1.0.102
Web-App Pod: 10.42.2.158
Service (Web-App): 10.43.26.206
Hey @CroutonDigital, that's strange. Are you sure the new configuration is applied successfully? You can enforce it with the following command: kubectl -n kube-system rollout restart daemonset/cilium
Here is a new version that explicitly allows external access to Cluster IPs (bpf.lbExternalClusterIP: true). Imho this should not be necessary as the IP should be SNATed with the Pod IP, but just in case it is smarter than I thought.
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
  lbExternalClusterIP: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450
"I also have an additional VM attached to the network, and it has the same problem communicating with the service."
This will not work out of the box because the underlying network does not know where to route the service CIDR traffic. Cluster IPs do not belong to a single node; they exist only virtually, known to the service load balancers inside the nodes.
Here are some more points you can try/verify:
Can you share the output of iptables -t nat -L -n -v, sysctl net.ipv4.ip_forward, ip a and ip route from inside of the OpenVPN Pod after you tried to reach the service from the OpenVPN client?
Please also try the following configuration:
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.42.0.0/16"
bpf:
  lbExternalClusterIP: true
socketLB:
  hostNamespaceOnly: true
MTU: 1450
Info: These settings also disable XDP, so that tcpdump could be able to see more with this setup. A new trace could be worth it after applying this.
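One visible difference between the configurations in this thread is the value of ipv4NativeRoutingCIDR: "10.0.0.0/8" earlier and "10.42.0.0/16" here. As a rough sketch, the standard-library check below shows which of the addresses reported in this thread fall inside each range. The helper name `coverage` and the labels are illustrative, and 10.8.0.6 is the client address from the second dump. The sketch only demonstrates range membership; it does not establish which setting resolved the problem.

```python
import ipaddress

# Addresses reported in this thread (labels are illustrative).
ADDRESSES = {
    "OVPN client (tun)": "10.8.0.6",
    "OVPN pod": "10.42.7.42",
    "Node (OVPN)": "10.1.0.101",
    "Web-App pod": "10.42.7.207",
    "Service (Web-App)": "10.43.26.206",
}

# The two ipv4NativeRoutingCIDR values that appear in the configurations above.
CIDRS = ["10.0.0.0/8", "10.42.0.0/16"]

def coverage(addresses, cidrs):
    """Map each label to the list of CIDRs that contain its address."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return {
        name: [str(n) for n in nets if ipaddress.ip_address(ip) in n]
        for name, ip in addresses.items()
    }

COVERAGE = coverage(ADDRESSES, CIDRS)
for name, nets in COVERAGE.items():
    print(f"{name:20} {ADDRESSES[name]:15} {', '.join(nets) or '-'}")
```

With the wider /8 range, the service cluster IP, the node IP and the VPN client range all count as natively routable; with the /16 range only the pod addresses do.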
@M4t7e it seems to have worked after kubectl -n kube-system rollout restart daemonset/cilium :)
I will test today and report back!
Yes, everything works fine.
Thank you!
Description
I deployed an OpenVPN pod to give the development team access to k8s services. An OpenVPN client can connect to a pod using the pod IP and port, but when it uses the service IP and service port, the connection times out.
I think the issue is the same as:
https://serverfault.com/questions/924773/openvpn-is-not-connecting-to-services-behind-it-iptables
Please help!
Kube.tf file
Screenshots
No response
Platform
Linux