nirmata / kube-static-egress-ip

Kubernetes CRD to manage static egress IP addresses for workloads
Apache License 2.0

make kube-static-egress-ip work with overlay network CNI's #13

Open murali-reddy opened 5 years ago

murali-reddy commented 5 years ago

kube-static-egress-ip works on the assumption that the director node can forward traffic to the gateway nodes. CNIs broadly fall into two categories: ones that work using overlay networks, and ones that use direct routing.

Flannel (VXLAN backend), Weave, etc. use VXLAN to overlay pod-to-pod traffic on the underlay network, so the underlay network never sees the pod-to-pod or pod-to-node traffic.

The current implementation of kube-static-egress-ip works only with CNIs that support direct routing of traffic from pods to a node. For example, Calico, kube-router, and Flannel (host-gw backend) allow pod traffic to be sent as-is to a different node.

This issue is a placeholder for enhancing kube-static-egress-ip to support static egress IP functionality even for CNIs that use overlay networking. Basically, kube-static-egress-ip should be able to steer traffic from the director node to the gateway node over the overlay network.

aleks-mariusz commented 5 years ago

any hope this is available any time soon? :-)

uablrek commented 5 years ago

I tried kube-static-egress-ip with flannel on k3s (great program!). First, the address on the flannel.1 interface must be used as the gw address:

kubectl annotate node vm-003 "nirmata.io/staticegressips-gateway=10.42.1.0"

I don't think this can be avoided on any overlay network, you must specify the gw address of the overlay.

But on director nodes the address is not directly usable;

# ip route add 192.168.2.0/24 via 10.42.1.0 table kube-static-egress-ip
Error: Nexthop has invalid gateway.

So the route setup will fail. But the route can be set up manually on the director nodes with:

ip route add 192.168.2.0/24 via 10.42.1.0 dev flannel.1 onlink table kube-static-egress-ip

i.e. you must insert "dev flannel.1 onlink". This works; I have tested it.

Proposal

A "fairly" easy solution in this case is to always use these command parameters. If a direct route works (non-overlay CNI-plugins) the command will work anyway.

It is "fairly" easy because you must get the interface to use, probably with;

# ip route get 10.42.1.0
10.42.1.0 via 10.42.1.0 dev flannel.1 src 10.42.0.0 uid 0 
    cache 

and then use it in the route setup.

This should fix the flannel case without breaking direct-routed CNI-plugins, but I don't know about other overlay CNI-plugins.
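The two steps in the proposal (look up the interface, then add the route with "dev ... onlink") could be sketched roughly as below. This is only an illustration, not how kube-static-egress-ip does it: the function names `route_dev` and `egress_route_cmd` are made up here, and the route command is printed rather than executed so it can be inspected first.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of kube-static-egress-ip.

# Read `ip route get <addr>` output on stdin and print the interface name,
# i.e. the token that follows "dev" on the first matching line.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Print (do not run) the route command for a destination CIDR, a gateway
# address and an interface. The table name is the one used in this thread.
egress_route_cmd() {
    dest=$1; gw=$2; dev=$3
    echo "ip route add $dest via $gw dev $dev onlink table kube-static-egress-ip"
}

# On a director node (needs root) the pieces would be combined like this:
#   dev=$(ip route get 10.42.1.0 | route_dev)
#   egress_route_cmd 192.168.2.0/24 10.42.1.0 "$dev"
```

With the `ip route get` output shown above, `route_dev` yields `flannel.1`, and the printed command matches the manual one that was tested with flannel. Whether "onlink" is harmless for every direct-routed CNI-plugin is an assumption that has not been verified here.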

murali-reddy commented 5 years ago

> I don't think this can be avoided on any overlay network, you must specify the gw address of the overlay.

@uablrek thanks for the insight. i will test it out and see how that works.

uablrek commented 5 years ago

Calico (tunnel mode) doesn't work either :disappointed:

uablrek commented 5 years ago

Update on calico

On the "gateway" the SNAT rule is inserted, but calico updates (syncs) iptables at intervals and after that the SNAT rule does not work any more. So immediately after the egress service is created the egress IP is translated. Some time later (about a minute) the function is disabled.

The trick with "onlink" that worked with flannel does not work for calico. The overlay address for the gateway is routed already and can't be used. If the "real" node address is used with an "onlink" to the tunl0 device, packets get through but are already SNAT'ed on the director to its overlay address.

Bottom-line; it seems very hard to find a CNI-plugin-agnostic solution.
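One way to observe when the SNAT rule stops being present on the gateway is to poll the nat table and count the rules that translate to the egress IP. The sketch below assumes the rule shows up in `iptables -t nat -S` output with a `--to-source <egress-ip>` argument; the exact chain and match used by kube-static-egress-ip is not stated in this thread, and `count_snat_rules` and `EGRESS_IP` are names made up for the example.

```shell
#!/bin/sh
# Sketch only: the rule layout is an assumption, see the note above.

# Read `iptables -t nat -S` style output on stdin and print how many rules
# carry "--to-source <ip>" for the IP given as the first argument.
count_snat_rules() {
    awk -v ip="$1" '
        { for (i = 1; i < NF; i++)
              if ($i == "--to-source" && $(i + 1) == ip) { n++; break } }
        END { print n + 0 }'
}

# On the gateway node (needs root), poll every 10 seconds:
#   while sleep 10; do
#       echo "$(date +%T) $(iptables -t nat -S | count_snat_rules "$EGRESS_IP")"
#   done
```

If the count stays at 1 while translation stops, the rule is probably still there but no longer reached, for example because a rule inserted ahead of it by the sync matches first. That interpretation is a guess, not something confirmed in the tests above.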

aleks-mariusz commented 5 years ago

can you give info about the SNAT rule? i figure there should be a way to match the outgoing IP's address and rewrite/mangle it to come from the egress IP desired.. ?

murali-reddy commented 5 years ago

@uablrek I tried your suggestion of using the overlay address as gateway, but I ran into an issue with martian packets. Did you hit any issue with martian packets?

uablrek commented 5 years ago

@murali-reddy No, but when working with k3s I have discovered that they seem to have found a way to accept martian packets, but I have not figured out how. Perhaps my setup only works with k3s. Do you use k3s?

K3s does DNAT to 127.0.0.1:6443 for k8s-api-server access, which should cause a "martian destination", but it doesn't.
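To compare two setups, the kernel settings that influence martian handling can be listed per interface. The sketch below prints `rp_filter`, `log_martians` and `route_localnet` from the standard Linux sysctl tree; the function name `show_martian_settings` is made up, and the base directory is a parameter only so the function can be tried on a copy of the tree. Whether k3s changes any of these values is not confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: lists settings, it does not change anything.

# Print rp_filter, log_martians and route_localnet for every interface
# directory below the given base (default: the live sysctl tree).
show_martian_settings() {
    base=${1:-/proc/sys/net/ipv4/conf}
    for d in "$base"/*/; do
        ifname=$(basename "$d")
        printf '%s rp_filter=%s log_martians=%s route_localnet=%s\n' \
            "$ifname" \
            "$(cat "$d/rp_filter")" \
            "$(cat "$d/log_martians")" \
            "$(cat "$d/route_localnet")"
    done
}

# On a node:
#   show_martian_settings
```

Setting `log_martians` to 1 on the receiving interface makes the kernel log dropped martian packets, which may help to see on which node and interface they are rejected.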

uablrek commented 5 years ago

A (the?) reason for setting the overlay address as gateway is to force the forwarded traffic to use the overlay rather than taking a direct route to the gw.

matthiassb commented 3 years ago

Any updates on this?

uablrek commented 3 years ago

@matthiassb While not being involved in development, I think it is extremely hard (read "impossible") to implement this so it works for any CNI-plugin with any network overlay. Unless you can live with always using a direct-routed CNI-plugin, I would advise looking for another solution, e.g. an egress gateway as Istio uses. Perhaps the eBPF-based CNI-plugins (Cilium, Calico with eBPF backend) or the OVS-based CNI-plugins can provide a way to use a specified egress address. But I think you must accept a CNI-specific solution.