submariner-io / submariner

Networking component for interconnecting Pods and Services across Kubernetes clusters.
https://submariner.io
Apache License 2.0

In the same cluster of Submariner, my non-gateway node vxlan interface cannot ping the gateway node's vxlan interface. #3134

Closed. JacobLi11 closed this issue 2 days ago.

JacobLi11 commented 3 weeks ago

What happened: In the same cluster of Submariner, my non-gateway node vxlan interface cannot ping the gateway node's vxlan interface.

What you expected to happen: I expect these two nodes to be able to ping each other.

How to reproduce it (as minimally and precisely as possible):

export SERVER_IP="192.168.3.36"
kind create cluster --config - <<EOF
kind: Cluster
name: broker
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.7.0.0/16"
  serviceSubnet: "10.77.0.0/16"
EOF

kind create cluster --config - <<EOF
kind: Cluster
name: c1
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.8.0.0/16"
  serviceSubnet: "10.88.0.0/16"
EOF

kind create cluster --config - <<EOF
kind: Cluster
name: c2
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  apiServerAddress: $SERVER_IP
  podSubnet: "10.9.0.0/16"
  serviceSubnet: "10.99.0.0/16"
EOF
subctl --context kind-broker deploy-broker
subctl --context kind-c1 join broker-info.subm --clusterid c1
subctl --context kind-c2 join broker-info.subm --clusterid c2

Anything else we need to know?:


Only ICMP request packets are seen, with no replies. I also deployed this on AWS EKS and hit the same problem.

Environment:

JacobLi11 commented 3 weeks ago

There are only ICMP requests and no replies. I think the two ends can communicate normally, but I don't know why my ICMP requests get no reply.

JacobLi11 commented 3 weeks ago

root@c2-worker:/# ip -d link show vx-submariner
10: vx-submariner: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 36:e4:24:36:bc:3a brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    vxlan id 100 srcport 0 0 dstport 4800 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
root@c2-worker:/# bridge fdb show dev vx-submariner
00:00:00:00:00:00 dst 172.18.0.5 self permanent
00:00:00:00:00:00 dst 172.18.0.6 self permanent
root@c2-worker:/#

This is my interface information and FDB table.

JacobLi11 commented 3 weeks ago

-A SUBMARINER-POSTROUTING -s 240.0.0.0/8 -o vx-submariner -j SNAT --to-source 10.9.0.1

When I delete this iptables rule, everything works fine. I don't quite understand why this rule is needed. Doesn't it make the VXLAN tunnel encapsulation ineffective?

yboaron commented 3 weeks ago

A. Submariner implements the egress part for inter-cluster traffic and lets the CNI handle the ingress direction (after IPsec decryption).

A.1 So, for podA@non_gw_node@clusterA to communicate with podB@non_gw_node@clusterB, Submariner will handle podA@non_gw_node --> gw_node@clusterA (via the vx-submariner interface) --> IPsec tunnel to the remote cluster.

A.2 The CNI should then forward the packet to podB@non_gw_node@clusterB.

B.

-A SUBMARINER-POSTROUTING -s 240.0.0.0/8 -o vx-submariner -j SNAT --to-source 10.9.0.1

This rule supports communication from HostNetwork pods (which use the node's IP address) to the remote cluster: the source IP address is SNATed to the IP of the node's CNI interface.
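To make the match condition of that rule concrete, here is a minimal sketch of the `-s 240.0.0.0/8` selector as a shell function. The helper name `in_snat_source_range` and the sample address `240.18.0.6` are made up for illustration and do not come from this thread. The sketch only models the source-address match, not the `-o vx-submariner` part of the rule.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the IPv4 address lies in 240.0.0.0/8,
# i.e. it would match the "-s 240.0.0.0/8" selector of the SNAT rule.
in_snat_source_range() {
    case "$1" in
        240.*.*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# 240.18.0.6 is an assumed example address; the other two are ordinary
# node-style and pod-style addresses used for contrast.
for src in 240.18.0.6 172.18.0.6 10.9.0.5; do
    if in_snat_source_range "$src"; then
        echo "$src: matches, source would be rewritten by the SNAT rule"
    else
        echo "$src: no match, source left unchanged by this rule"
    fi
done

# To list the live rules on a node (needs root):
#   iptables -t nat -S SUBMARINER-POSTROUTING
```

Only packets whose source is in 240.0.0.0/8 and that leave through vx-submariner are rewritten. That would include a ping sourced from the vx-submariner interface itself, if that interface carries a 240.x address.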

C. Did you try checking inter-cluster connectivity between the clusters? You can use subctl verify.
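A possible invocation for the check suggested above, as a sketch rather than a definitive recipe. It assumes the kind contexts `kind-c1` and `kind-c2` from the reproduction steps and a recent `subctl` release that accepts `--context`, `--tocontext`, `--only` and `--verbose` on `verify`. The `run` wrapper is a hypothetical helper that only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command by default (dry run),
# execute it only when DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the tunnel status as seen from c1 (context name assumed from the kind setup).
run subctl show connections --context kind-c1

# Run only the connectivity suite of the end-to-end checks between c1 and c2.
run subctl verify --context kind-c1 --tocontext kind-c2 --only connectivity --verbose
```

Running the script as-is only echoes the two commands. Set `DRY_RUN=0` to execute them against the clusters.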

JacobLi11 commented 3 weeks ago

@yboaron Thank you for your answer. My traffic here is not cross-cluster traffic but traffic within this cluster. I want to check connectivity between the vx-submariner interfaces on the non-gateway node and the gateway node, and I found that the two interfaces cannot ping each other.
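For anyone repeating this test, a small sketch of how the ping targets can be worked out. It assumes that Submariner derives each node's vx-submariner address by replacing the first octet of the node IP with 240, which is consistent with the 240.0.0.0/8 source in the SNAT rule quoted earlier. That derivation is an assumption here, and `vtep_ip` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: derive the assumed vx-submariner address for a
# node IP by swapping the first octet for 240.
vtep_ip() {
    echo "240.${1#*.}"
}

# Node IPs taken from the FDB destinations shown earlier in the thread.
for node_ip in 172.18.0.5 172.18.0.6; do
    echo "node $node_ip -> vx-submariner $(vtep_ip "$node_ip")"
done

# To test from inside a node (needs root):
#   ping -c 3 -I vx-submariner "$(vtep_ip 172.18.0.5)"
```

The authoritative value is whatever `ip addr show vx-submariner` reports on each node, so check it before relying on the derived address.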

yboaron commented 2 weeks ago

Submariner implements inter-cluster connectivity. By design, Submariner handles the egress part while the CNI is supposed to handle ingress, so connectivity between vx-submariner interfaces on different nodes in the same cluster is not expected to work. What you observed is the expected behavior.

Can you elaborate on why you are checking connectivity within a cluster via the vx-submariner interfaces? Is this for troubleshooting an inter-cluster data path connectivity issue?

JacobLi11 commented 2 days ago

I have this requirement in my use case, but I already know the solution. Thank you, I will close this issue.