antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0

Egress traffic to egress gateway labelled Node's interface dropped when Egress subnet is configured different from node's network #6359

Open rajnkamr opened 1 month ago

rajnkamr commented 1 month ago

Describe the bug

Configure an Egress network other than the node's network. Antrea should route Egress traffic to the specified gateway.

egrssscrdresource.yaml

    apiVersion: crd.antrea.io/v1beta1
    kind: Egress
    metadata:
      name: snat-testapp-ip
    spec:
      appliedTo:
        podSelector:
          matchLabels:
            app: antrea-test-app  # Select the Pods to which the SNAT policy will be applied
      externalIPPool: external-ip-pool1

externalippoolcrdresource.yaml

    apiVersion: crd.antrea.io/v1beta1
    kind: ExternalIPPool
    metadata:
      name: external-ip-pool1
    spec:
      ipRanges:

Label the node to be used as the Egress node:

    kubectl label nodes nodes-worker network-role=egress-gateway

The Egress interface is created:

    28: antrea-ext.10@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
        link/ether 02:42:ac:12:00:04 brd ff:ff:ff:ff:ff:ff
        inet 10.10.0.2/24 brd 10.10.0.255 scope global antrea-ext.10
           valid_lft forever preferred_lft forever

The route is added:

    10.10.0.0/24 dev antrea-ext.10 proto kernel scope link src 10.10.0.2

The Egress gateway is reachable from the node via the default interface:

    root@nodes-worker:/# ping 10.10.0.1 -I eth0
    PING 10.10.0.1 (10.10.0.1) from 172.18.0.4 eth0: 56(84) bytes of data.
    64 bytes from 10.10.0.1: icmp_seq=1 ttl=64 time=0.134 ms
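For readers following along, here is a minimal sketch of how a longest-prefix route lookup would treat the gateway address, with and without forcing the output interface. It is a simplified model, not the kernel's actual logic. The `10.10.0.0/24` route comes from the output above, while the default route via `eth0` and the helper name `select_route` are assumptions made for illustration.

```python
import ipaddress

# Simplified routing table: (prefix, output interface).
# The 10.10.0.0/24 entry is taken from the report; the default route
# via eth0 is an assumption, since the report does not show it.
ROUTES = [
    ("10.10.0.0/24", "antrea-ext.10"),
    ("0.0.0.0/0", "eth0"),
]


def select_route(dest, routes=ROUTES, oif=None):
    """Return the (prefix, interface) pair with the longest matching prefix.

    If oif is given, only routes through that interface are considered,
    which loosely models binding to a device as `ping -I eth0` does.
    Returns None when no route matches.
    """
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, dev in routes:
        if oif is not None and dev != oif:
            continue
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return None if best is None else (str(best[0]), best[1])


# Unconstrained lookup: the connected /24 route wins.
print(select_route("10.10.0.1"))
# Lookup restricted to eth0: only the (assumed) default route remains.
print(select_route("10.10.0.1", oif="eth0"))
```

Under these assumptions, a successful ping with `-I eth0` only shows that the gateway answers over the `eth0` path. It does not show that the gateway is reachable through `antrea-ext.10`.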

To Reproduce

1. Apply the CRD resource YAML specified above.
2. Try to communicate with an external network from a Pod on the egress-gateway labelled node, or from any Pod on any other node.
3. Capture traffic on the Egress interface of the egress-gateway labelled node using tcpdump.
4. No traffic is seen at the Egress interface (antrea-ext.10) on the node.

Expected

External traffic should be routed to the gateway 10.10.0.1 (configured as a separate interface for external reachability).

Actual behavior

Egress traffic is not seen at the gateway interface.

Versions:

Additional context

tnqn commented 1 month ago

It's really hard to read and understand this issue due to the format and missing punctuation. To make it understandable, could you format yaml/command output using markdown syntax and break paragraphs using empty lines?

For the issue itself, given that you can ping the gateway 10.10.0.1 from eth0, which is not a VLAN interface, I doubt the gateway really expects a VLAN tag. Can you check whether it's still reachable if you don't specify -I eth0 (and why you need to specify it in the first place)?
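To make the VLAN point concrete, the sketch below shows how a tagged Ethernet frame differs from an untagged one on the wire. An 802.1Q frame carries the TPID `0x8100` where the EtherType normally sits, followed by a 16-bit tag whose low 12 bits hold the VLAN ID. The helper name `vlan_id` and the sample frames are hypothetical, and VLAN ID 10 is only inferred from the interface name `antrea-ext.10`.

```python
import struct

TPID_8021Q = 0x8100  # value in the EtherType position that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID of an Ethernet frame, or None if it is untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 16:
        raise ValueError("truncated 802.1Q tag")
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF  # low 12 bits of the tag control information


# Sample headers only (no payload). The source MAC is the one from the report;
# VLAN ID 10 is an assumption based on the interface name.
dst = b"\xff" * 6
src = bytes.fromhex("0242ac120004")
untagged = dst + src + struct.pack("!H", 0x0800)
tagged = dst + src + struct.pack("!HH", TPID_8021Q, 10) + struct.pack("!H", 0x0800)

print(vlan_id(untagged))
print(vlan_id(tagged))
```

Frames sent out of a VLAN sub-interface would look like `tagged`, while the successful ping over `eth0` used frames like `untagged`. A gateway that only handles untagged traffic would not answer tagged frames.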