cloudnativelabs / kube-router

Kube-router, a turnkey solution for Kubernetes networking.
https://kube-router.io
Apache License 2.0

Unable To Route to IPv6 Service VIPs from Same Node #1698

Closed aauren closed 4 months ago

aauren commented 4 months ago

What happened?

When traffic originates from the same host that carries a copy of the IPv6 service VIP on one of its interfaces, requests to that VIP never receive a response.

This does not happen with IPv4 and appears to only affect IPv6 service VIPs.

What did you expect to happen?

kube-router service VIPs should always be available and testable from the host OS, regardless of whether the IP address exists on that node. This assists in troubleshooting and has long been the case for IPv4 addresses.

How can we reproduce the behavior you experienced?

Steps to reproduce the behavior:

  1. Enable IPv6 mode in kube-router (--enable-ipv6=true)
  2. Add an IPv6 service Cluster IP range (--service-cluster-ip-range=2001:db8:42:1::/112)
  3. Create a test service that has DualStack configured that selects some pods:
    
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        kube-router.io/service.local: "true"
        purpose: "Creates a VIP for balancing an application"
      labels:
        name: whoami
      name: whoami
      namespace: default
    spec:
      ports:
      - name: flask
        port: 5000
        protocol: TCP
        targetPort: 5000
      ipFamilyPolicy: PreferDualStack
      selector:
        name: whoami
      type: ClusterIP

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: whoami
      namespace: default
    spec:
      selector:
        matchLabels:
          name: whoami
      template:
        metadata:
          labels:
            name: whoami
        spec:
          securityContext:
            runAsUser: 1000
            fsGroup: 1000
          tolerations:

  4. Attempt to route to the IPv4 Service VIP from the node and see that it succeeds:

% curl "http://10.96.104.78:5000"
Hostname: whoami-ghshw
IP: 127.0.0.1
IP: ::1
IP: 10.242.0.4
IP: 2001:db8:42:1000::4
IP: fe80::29:a2ff:fe53:166
RemoteAddr: 10.95.0.131:43788
GET / HTTP/1.1
Host: 10.96.104.78:5000
User-Agent: curl/7.81.0
Accept: */*

5. Attempt to route to the IPv6 Service VIP from the node and see that it fails:
```sh
% curl --max-time 2 "http://[2001:db8:42:1::d45a]:5000"
curl: (28) Connection timed out after 2001 milliseconds
```

System Information (please complete the following information)

Additional context

When kube-router adds Service VIP addresses to the kube-dummy-if interface it has always added an extra local route which mutates the traffic to look like it comes from the primary node IP address: https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/proxy/linux_networking.go#L170-L178

As the comment there says, this keeps Linux from using the VIP itself as the source address for locally originated traffic, since replies to that address would never route back to the originating process.
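
To make the shape of that extra route concrete, here is a minimal sketch of an equivalent iproute2 invocation, modeled on the IPv4 route listed later in this issue. The function name `vip_local_route_cmd` is hypothetical and this is not kube-router's actual code. It only prints the command so that it can be inspected before anything is run as root, and whether the kernel honors every attribute (for example `scope host`) for IPv6 is an assumption that is not verified here.

```sh
#!/bin/sh
# Hypothetical helper: print (do not execute) an iproute2 command that would
# install a local route for a service VIP with the node IP as preferred source.
# Arguments: <vip> <node-ip> [device]; the device defaults to kube-dummy-if.
vip_local_route_cmd() {
    vip="$1"
    node_ip="$2"
    dev="${3:-kube-dummy-if}"

    # Pick the address family flag from the VIP notation (":" means IPv6).
    case "$vip" in
        *:*) family="-6" ;;
        *)   family="-4" ;;
    esac

    printf 'ip %s route replace local %s dev %s table local proto kernel scope host src %s\n' \
        "$family" "$vip" "$dev" "$node_ip"
}

# Addresses taken from the outputs quoted in this issue.
vip_local_route_cmd 10.96.104.78 10.95.0.131
vip_local_route_cmd 2001:db8:42:1::d45a 2600:1f18:5302:3d00:d710:3df9:102b:67d7
```

Printing instead of executing keeps the sketch safe to run on any machine; the resulting line can be compared against `ip route show table all` on a node to see which attributes the kernel actually kept.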

There appears to be a difference in the way that iproute2 adds IPv4 and IPv6 addresses. When it creates IPv4 addresses we see the following output:

% ip route show table all | grep $(kubectl get service -n default whoami -o jsonpath='{.spec.clusterIPs[0]}')
local 10.96.104.78 dev kube-dummy-if table local proto kernel scope host src 10.95.0.131

This represents only the route that kube-router adds and nothing else.

However, if we look at an IPv6 address we see the following:

% ip -6 route show table all | grep $(kubectl get service -n default whoami -o jsonpath='{.spec.clusterIPs[1]}')
2001:db8:42:1::d45a dev kube-dummy-if proto kernel metric 256 pref medium
local 2001:db8:42:1::d45a dev kube-dummy-if table local proto kernel metric 0 pref medium
local 2001:db8:42:1::d45a dev kube-dummy-if table local proto kernel src 2600:1f18:5302:3d00:d710:3df9:102b:67d7 metric 1024 pref medium

Only the last route is the one kube-router adds; the first two seem to be added by iproute2 itself for some reason. Once both of those routes are removed, the traffic flows as it should:

% sudo ip -6 route del 2001:db8:42:1::d45a dev kube-dummy-if proto kernel metric 256 pref medium

% curl --max-time 2 "http://[2001:db8:42:1::d45a]:5000"                                         
curl: (28) Connection timed out after 2000 milliseconds

% sudo ip -6 route del local 2001:db8:42:1::d45a dev kube-dummy-if table local proto kernel metric 0 pref medium

% curl --max-time 2 "http://[2001:db8:42:1::d45a]:5000"                                                         
Hostname: whoami-ghshw
IP: 127.0.0.1
IP: ::1
IP: 10.242.0.4
IP: 2001:db8:42:1000::4
IP: fe80::29:a2ff:fe53:166
RemoteAddr: [2600:1f18:5302:3d00:d710:3df9:102b:67d7]:48476
GET / HTTP/1.1
Host: [2001:db8:42:1::d45a]:5000
User-Agent: curl/7.81.0
Accept: */*
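
The manual cleanup above can be approximated by first identifying which routes for a VIP lack a `src` attribute, since those are the two that had to be removed. The following is a rough sketch only: the function name `routes_without_src` is hypothetical, it reads route listings from stdin rather than calling `ip` itself, and it deliberately deletes nothing.

```sh
#!/bin/sh
# Hypothetical helper: read `ip -6 route show table all` output on stdin and
# print the routes for the given VIP that carry no "src" attribute.
routes_without_src() {
    vip="$1"
    awk -v vip="$vip" '
        {
            has_vip = 0; has_src = 0
            for (i = 1; i <= NF; i++) {
                if ($i == vip)   has_vip = 1   # exact field match, not substring
                if ($i == "src") has_src = 1
            }
            if (has_vip && !has_src) print
        }
    '
}

# Example input copied from the route listing in this issue.
routes_without_src 2001:db8:42:1::d45a <<'EOF'
2001:db8:42:1::d45a dev kube-dummy-if proto kernel metric 256 pref medium
local 2001:db8:42:1::d45a dev kube-dummy-if table local proto kernel metric 0 pref medium
local 2001:db8:42:1::d45a dev kube-dummy-if table local proto kernel src 2600:1f18:5302:3d00:d710:3df9:102b:67d7 metric 1024 pref medium
EOF
```

On a live node the same function could be fed with `ip -6 route show table all | routes_without_src <vip>`, and each printed line reviewed before passing it to `ip -6 route del` by hand.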