antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0

Antrea EgressIP does not work if wireGuard is enabled #6190

Open adolfomaltez opened 2 months ago

adolfomaltez commented 2 months ago

Describe the bug

When an egressIP is created and applied to pods, it does not work correctly if wireGuard is enabled. The egressIP works only for the pod that is on the node that takes the egressIP. The rest of the pods on the nodes that do not have the egressIP lose connectivity to the outside of the cluster. If wireGuard is disabled, egressIP works correctly for all pods.

To Reproduce

A Kubernetes cluster is created with kind.

git clone https://github.com/antrea-io/antrea.git
cd antrea
git checkout release-1.15
docker pull projects.registry.vmware.com/antrea/antrea-ubuntu:v1.15.0
./ci/kind/kind-setup.sh --images projects.registry.vmware.com/antrea/antrea-ubuntu:v1.15.0 create cluster

Antrea is installed.

kubectl apply -f https://github.com/antrea-io/antrea/releases/download/v1.15.0/antrea.yml

egressIP is enabled (Egress: true)

kubectl edit cm antrea-config -n kube-system
kubectl rollout restart deployment/antrea-controller -n kube-system
kubectl rollout restart daemonset/antrea-agent -n kube-system

A deployment and egressIP are created.

kubectl create -f hello-world-egressIP.yaml
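For readers without access to the attachment, here is a minimal sketch of what an Egress manifest of this kind could look like. It is not the reporter's file: the resource name, the `app: hello-world` label, and the IP `172.18.0.100` are hypothetical, and the `crd.antrea.io/v1beta1` API version is assumed to match Antrea v1.15. With a static `egressIP` and no `externalIPPool`, the IP would also have to be assigned to a Node by hand.

```shell
# write_egress_manifest PATH
# Writes an illustrative Egress manifest to PATH.
# Name, label, and IP are hypothetical placeholders.
write_egress_manifest() {
    cat > "$1" <<'EOF'
apiVersion: crd.antrea.io/v1beta1
kind: Egress
metadata:
  name: hello-world-egress
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: hello-world
  egressIP: 172.18.0.100
EOF
}

# Usage (requires a cluster with Antrea installed):
#   write_egress_manifest /tmp/egress.yaml
#   kubectl create -f /tmp/egress.yaml
```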

Test connection to an outside service (nginx on laptop). All 3x pods show egressIP as their source IP.

wireGuard is enabled ( trafficEncryptionMode: "wireGuard" )

kubectl edit cm antrea-config -n kube-system
kubectl rollout restart deployment/antrea-controller -n kube-system
kubectl rollout restart daemonset/antrea-agent -n kube-system

Test connection to an outside service (nginx on laptop). Only the pod on the node with the egressIP shows the egressIP as its source IP. The other 2x pods fail to connect outside the cluster (Operation timed out).

Expected

All 3 pods of the deployment should use the egressIP as their source IP when connecting to a service external to the cluster, even with wireGuard enabled.

Actual behavior

Only the pod running on the node that takes the egressIP works correctly. Pods on different nodes cannot access services outside the cluster. If wireGuard is disabled, all pods work correctly with the egressIP.

Versions:

Please provide the following information:

Additional context

hello-world-egressIP.yaml.txt

antoninbas commented 2 months ago

Hi @adolfomaltez, and thanks for the detailed bug report. I was able to confirm what you are observing, i.e. Egress cannot be used when enabling WireGuard.

This is an interesting case. First, I want to point out that Egress traffic is sent to the "Egress Node" (the Node to which the Egress IP is currently assigned) through a tunnel (Geneve by default). Even when WireGuard is enabled, the Egress implementation tries to use the default non-encrypted Geneve tunnel. Starting with Antrea v1.15, however, we no longer create the default tunnel port when WireGuard is enabled. This was introduced in this PR: https://github.com/antrea-io/antrea/pull/5885. This is the first reason why Egress is not working with WireGuard.

But even if Antrea is downgraded to v1.14.3 (which does not include this patch, and hence still creates the Geneve tunnel port when WireGuard is enabled), Egress is still not working. This is because Linux Reverse Path Filtering (rp_filter) drops the traffic when it gets to the Egress Node (before SNAT). You can disable rp_filter on antrea-gw0, and this will let you reach your external server, using the Egress IP as the source IP as desired. However, at that point the return path is still broken: return traffic needs to take the WireGuard tunnel from the Egress Node back to the source Node (where the client is). As you can see, this creates an asymmetry in the path, and we configure WireGuard to only allow Pod IPs, which is yet another blocker here.
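A note on the rp_filter step for anyone trying it: the Linux kernel uses the maximum of `net.ipv4.conf.all.rp_filter` and the per-interface value when validating sources, so setting antrea-gw0 to 0 only has an effect if `all` is also 0. The sketch below illustrates that rule. The helper names are made up for this example, and `show_rp_filter` assumes a Linux host where the named interface exists.

```shell
# effective_rp_filter ALL_VALUE IFACE_VALUE
# The kernel applies max(conf.all.rp_filter, conf.<iface>.rp_filter).
effective_rp_filter() {
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# show_rp_filter IFACE
# Reads the live settings from /proc (read-only, no root needed).
show_rp_filter() {
    all=$(cat /proc/sys/net/ipv4/conf/all/rp_filter)
    ifc=$(cat "/proc/sys/net/ipv4/conf/$1/rp_filter")
    echo "all=$all $1=$ifc effective=$(effective_rp_filter "$all" "$ifc")"
}

# all=1 overrides antrea-gw0=0, so strict filtering would still apply.
effective_rp_filter 1 0   # prints 1

# On the Egress Node (as root), disabling the check on the gateway would be:
#   sysctl -w net.ipv4.conf.antrea-gw0.rp_filter=0
#   show_rp_filter antrea-gw0
```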

If we want to support WireGuard with Egress, we will need to revert #5885, and make adjustments to the datapath so that return traffic can be forwarded correctly (not through the WireGuard tunnel). This could probably be achieved using a fwmark / ctmark and policy-based routing? cc @tnqn

Something that could be considered is whether WireGuard can be used to encrypt Egress traffic between the Egress Node and the source Node. The kind of source-based routing we need for Egress is probably not easy to achieve with WireGuard, and encryption is not really required IMO since Egress traffic is destined to exit the cluster.

tnqn commented 2 months ago

> If we want to support WireGuard with Egress, we will need to revert #5885, and make adjustments to the datapath so that return traffic can be forwarded correctly (not through the WireGuard tunnel). This could probably be achieved using a fwmark / ctmark and policy-based routing? cc @tnqn

Yes, the proposal should work. We could allocate one ctmark bit to represent it, set it after matching outgoing Egress traffic, restore it to the fwmark for traffic coming from interfaces other than antrea-gw0, and then route that traffic to antrea-gw0 via policy routing.
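To make the proposal concrete, the following is a rough sketch of the host-side rules it might translate to on the Egress Node. Nothing here reflects an actual Antrea implementation: the mark bit `0x40`, routing table `300`, and the `0x0/0xff` match used to recognize outgoing Egress traffic are all hypothetical. The function only prints the commands, so it is safe to run for inspection.

```shell
# All values are hypothetical placeholders, not Antrea's actual choices.
EGRESS_MARK=0x40      # single mark bit reserved for Egress return traffic
EGRESS_TABLE=300      # custom routing table ID
GW_IFACE=antrea-gw0

# Prints (does not execute) the commands for the ctmark/fwmark approach.
egress_return_path_cmds() {
    m="$EGRESS_MARK/$EGRESS_MARK"
    # 1. Set the ctmark bit on connections carrying outgoing Egress traffic
    #    (assumed here to arrive from the gateway with a non-zero packet mark).
    printf '%s\n' "iptables -t mangle -A PREROUTING -i $GW_IFACE -m mark ! --mark 0x0/0xff -j CONNMARK --set-xmark $m"
    # 2. Restore the bit into the fwmark for return traffic from other interfaces.
    printf '%s\n' "iptables -t mangle -A PREROUTING ! -i $GW_IFACE -m connmark --mark $m -j CONNMARK --restore-mark --nfmask $EGRESS_MARK --ctmask $EGRESS_MARK"
    # 3. Policy routing: marked packets use a dedicated table...
    printf '%s\n' "ip rule add fwmark $m table $EGRESS_TABLE"
    # 4. ...whose only route sends them to the gateway instead of WireGuard.
    printf '%s\n' "ip route add default dev $GW_IFACE table $EGRESS_TABLE"
}

egress_return_path_cmds
```

Printing rather than executing keeps the sketch reviewable; the output would need root privileges, and validation against the real datapath, before being applied.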

> Something that could be considered is whether WireGuard can be used to encrypt Egress traffic between the Egress Node and the source Node. The kind of source-based routing we need for Egress is probably not easy to achieve with WireGuard, and encryption is not really required IMO since Egress traffic is destined to exit the cluster.

Agreed. And it has been documented that "Antrea can leverage WireGuard to encrypt Pod traffic between Nodes."

antoninbas commented 2 months ago

@luolanzone maybe we could scope this for v2.1?

tnqn commented 2 months ago

Created a milestone for v2.1 and added the issue to it.

antoninbas commented 2 months ago

A quick note that we will need to pay attention to the MTU in that case, and account for both WireGuard and Geneve (we should apply the larger of the two MTU deductions).
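As a small worked example of that MTU rule, the sketch below subtracts the largest encapsulation overhead from the Node MTU. The overhead values (50 bytes for Geneve, 80 bytes for WireGuard, both over IPv4) are assumptions for illustration and should be checked against the Antrea version in use.

```shell
# Assumed per-packet overheads in bytes (IPv4); verify for your setup.
GENEVE_OVERHEAD=50
WIREGUARD_OVERHEAD=80

# pod_mtu NODE_MTU OVERHEAD...
# Prints NODE_MTU minus the largest of the given overheads.
pod_mtu() {
    node_mtu=$1
    shift
    max=0
    for o in "$@"; do
        if [ "$o" -gt "$max" ]; then max=$o; fi
    done
    echo $((node_mtu - max))
}

pod_mtu 1500 "$GENEVE_OVERHEAD" "$WIREGUARD_OVERHEAD"   # prints 1420
```

Taking the maximum rather than the sum reflects that a given packet travels through one tunnel type or the other, not both at once.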

withlin commented 1 month ago

Same issue with Antrea v1.15.0.