submariner-io / submariner

Networking component for interconnecting Pods and Services across Kubernetes clusters.
https://submariner.io
Apache License 2.0

Fail to replay a case in NAT traversal #1492

Closed YAXILALIANGFEIFAN closed 3 years ago

YAXILALIANGFEIFAN commented 3 years ago

mangelajo note: this seems to be fixed in 0.9.1; the port annotations were not being picked up correctly.

What happened: I successfully deployed Submariner to my 4 Kubernetes clusters (version 1.19.7) created with kubeadm, and all of them work well, including pods and services. The problem is that pods in cluster-b cannot reach pods in cluster-c by cluster IP across clusters, although they can reach pods in cluster-a, cluster-b and cluster-d by cluster IP. The four clusters are deployed as shown in Fig. 1, following the "Public Cloud vs On-Premises" section at https://submariner.io/operations/nat-traversal

[Fig. 1: image my-NAT1-version3]

Details of the 4 clusters are listed in Table 1.

| cluster-name | role | ip address | pod cidr | service cidr | CNI version | deployment mode | kube-proxy mode | note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| cluster-a | broker k8s (v1.19.7) | 43.128.40.60 | 10.44.0.0/16 | 10.45.0.0/16 | flannel v0.14.0 | On-Premise | iptables | subctl v0.9.0 |
| cluster-b | managed k8s (v1.19.7) | 43.128.85.30 | 10.144.0.0/16 | 10.145.0.0/16 | flannel v0.14.0 | On-Premise | iptables | subctl v0.9.0 |
| cluster-c | managed k8s (v1.19.7) | 43.128.85.30 | 10.4.0.0/16 | 10.5.0.0/16 | flannel v0.14.0 | On-Premise | iptables | subctl v0.9.0 |
| cluster-d | managed k8s (v1.19.7) | 150.109.237.90 | 10.88.0.0/16 | 10.89.0.0/16 | flannel v0.14.0 | On-Premise | iptables | subctl v0.9.0 |


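The results of the pod network connectivity tests are summarized in Table 2 (Y = reachable, N = not reachable).

| cluster-name | cluster-a | cluster-b | cluster-c | cluster-d |
| --- | --- | --- | --- | --- |
| cluster-a | Y | Y | Y | Y |
| cluster-b | Y | Y | N | Y |
| cluster-c | Y | N | Y | Y |
| cluster-d | Y | Y | Y | Y |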


How to reproduce it:
(1) Create 4 Kubernetes clusters with kubeadm, with no pod/service CIDR overlap.
(2) Assign a NAT gateway for cluster-b and cluster-c.
(3) Deploy the broker on cluster-a.
(4) Join cluster-a, cluster-b, cluster-c and cluster-d to the broker on cluster-a (attached scripts: cluster-a.sh.txt, cluster-b.sh.txt, cluster-c.sh.txt, cluster-d.sh.txt).
(5) Run kubectl annotate node on the gateway nodes, restart the gateways, and set up the router port mapping, following the "Public Cloud vs On-Premises" section at https://submariner.io/operations/nat-traversal
(6) Try the verify-manually case using an nginx service, as in https://submariner.io/getting-started/quickstart/k3s/#verify-manually
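Step (1) depends on the pod and service CIDRs not overlapping across clusters. A minimal sketch of how that could be checked with plain shell arithmetic is shown here; the function names (`ip_to_int`, `cidr_range`, `cidrs_overlap`, `check_all_cidrs`) are hypothetical helpers for illustration, not part of subctl or kubectl, and the code assumes well-formed IPv4 CIDRs.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1            # split the address on "." into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Print "start end" (inclusive, as integers) for a CIDR such as 10.44.0.0/16.
cidr_range() {
  local ip prefix base size
  ip=${1%/*}
  prefix=${1#*/}
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - prefix) ))
  base=$(( base & ~(size - 1) ))   # clear host bits
  echo "$base $(( base + size - 1 ))"
}

# Succeed (exit 0) if the two CIDRs share at least one address.
cidrs_overlap() {
  local r1 r2
  r1=$(cidr_range "$1")
  r2=$(cidr_range "$2")
  # [s1,e1] and [s2,e2] overlap iff s1 <= e2 and s2 <= e1
  [ "${r1% *}" -le "${r2#* }" ] && [ "${r2% *}" -le "${r1#* }" ]
}

# Compare every unordered pair; print overlaps and return 1 if any exist.
check_all_cidrs() {
  local a b found=0
  while [ "$#" -gt 1 ]; do
    a=$1
    shift
    for b in "$@"; do
      if cidrs_overlap "$a" "$b"; then
        echo "overlap: $a $b"
        found=1
      fi
    done
  done
  return "$found"
}
```

Called with the eight pod and service CIDRs from Table 1, `check_all_cidrs` prints nothing and returns 0, which matches the statement that the clusters do not overlap.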


Anything else we need to know?: If I skip step (5) in "How to reproduce it", I appear to get the same result as in Table 2.


Environment: Operating System of all the clusters: Ubuntu 18.04

for cluster-a: [image: subctl-cluster-a]

for cluster-b: [image: subctl-cluster-b]

mangelajo commented 3 years ago

For this to work, you need to enable the NATT discovery protocol at least on clusters B & C, and make sure that their gateways can reach each other via their private IPs (10.0.0.16 <> 10.0.0.33).

I see that you mention enabling that, but I suspect that it's not working:

kubectl annotate node $GWC --kubeconfig C gateway.submariner.io/natt-discovery-port=4491
kubectl annotate node $GWC --kubeconfig C gateway.submariner.io/udp-port=4501
kubectl annotate node $GWD --kubeconfig D gateway.submariner.io/natt-discovery-port=4492
kubectl annotate node $GWD --kubeconfig D gateway.submariner.io/udp-port=4502

# restart the gateways to pick up the new setting
for cluster in C D;
do
  kubectl delete pod -n submariner-operator -l app=submariner-gateway --kubeconfig $cluster
done

If the NATT discovery protocol worked, the connection between those clusters should show NAT=no in the subctl show connections list, with the private IP as the remote endpoint and a "connected" status.
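One quick check is to read the annotations back from the gateway node. The following is a minimal sketch, assuming kubectl is on the PATH; `show_natt_annotations` is a hypothetical helper name, while the annotation keys are the ones used in the commands above. It only confirms that the annotations are stored on the node object, not that the gateway pod actually uses them.

```shell
# Print the Submariner port annotations stored on a gateway node.
# $1 = node name, $2 = kubeconfig path
show_natt_annotations() {
  local node="$1" kubeconfig="$2" key
  for key in natt-discovery-port udp-port; do
    printf '%s=' "$key"
    # Dots inside the annotation key must be escaped in kubectl JSONPath.
    kubectl get node "$node" --kubeconfig "$kubeconfig" \
      -o "jsonpath={.metadata.annotations.gateway\.submariner\.io/${key}}"
    printf '\n'
  done
}
```

For example, `show_natt_annotations "$GWC" C` would print one `key=value` line per annotation; an empty value means the annotation is not set on that node.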

Could you:

1) Provide the subctl gather output for clusters C & D?
2) Check whether ports 4491, 4501, 4492 and 4502 are reachable between the gateways of clusters C & D on the 10.0.0.0/16 network?
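A rough way to run the port check from one gateway host against the other is sketched here, assuming a netcat build that supports `-u`, `-z` and `-w` (for example the OpenBSD variant); `probe_udp_ports` is a hypothetical helper name. UDP has no handshake, so a successful probe only means that no error came back, not that a Submariner gateway answered.

```shell
# Probe one or more UDP ports on a host.
# $1 = target host, remaining arguments = ports
probe_udp_ports() {
  local host="$1" port
  shift
  for port in "$@"; do
    # -u: UDP, -z: send no data, -w 2: give up after 2 seconds
    if nc -u -z -w 2 "$host" "$port"; then
      echo "$host:$port no error reported"
    else
      echo "$host:$port unreachable or refused"
    fi
  done
}
```

Run from the gateway at 10.0.0.16, a call such as `probe_udp_ports 10.0.0.33 4491 4501` would test the other gateway's private address; which ports belong to which gateway depends on the annotations actually applied.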

Let's figure this out, thank you!

YAXILALIANGFEIFAN commented 3 years ago

More details about the environment:

cluster-c: [image: subctl-cluster-c]

cluster-d: [image: subctl-cluster-d]

router port mapping: [image: router-port-mapping]

mangelajo commented 3 years ago

It seems like updating to 0.9.1 fixed the issue.

The annotations for the UDP & NAT-discovery ports were not working well in 0.9.0

Running this on all clusters fixed it:

kubectl -n submariner-operator set image deployment/submariner-operator submariner-operator=quay.io/submariner/submariner-operator:0.9.1
kubectl patch Submariner submariner -n submariner-operator -p '{"spec": {"version": "0.9.1"}}' --type=merge
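Since the two commands have to be run on every cluster, they can be wrapped in a loop over kubeconfig files. This is a minimal sketch under that assumption; `upgrade_submariner` and `upgrade_all` are hypothetical helper names, and the kubeconfig paths in the usage example are placeholders.

```shell
# Apply the operator image bump and the Submariner version patch to one cluster.
# $1 = kubeconfig path, $2 = target version (e.g. 0.9.1)
upgrade_submariner() {
  local kubeconfig="$1" version="$2"
  kubectl --kubeconfig "$kubeconfig" -n submariner-operator \
    set image deployment/submariner-operator \
    "submariner-operator=quay.io/submariner/submariner-operator:${version}" || return 1
  kubectl --kubeconfig "$kubeconfig" -n submariner-operator \
    patch Submariner submariner --type=merge \
    -p "{\"spec\": {\"version\": \"${version}\"}}" || return 1
}

# Upgrade several clusters in sequence, stopping at the first failure.
# $1 = target version, remaining arguments = kubeconfig paths
upgrade_all() {
  local version="$1" kubeconfig
  shift
  for kubeconfig in "$@"; do
    echo "upgrading cluster with kubeconfig ${kubeconfig} to ${version}"
    upgrade_submariner "$kubeconfig" "$version" || return 1
  done
}
```

Usage would look like `upgrade_all 0.9.1 kubeconfig-a kubeconfig-b kubeconfig-c kubeconfig-d`.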

Now connectivity seems to be happier:

Showing Connection details
GATEWAY            CLUSTER    REMOTE IP       NAT  CABLE DRIVER  SUBNETS                     STATUS     RTT avg.     
vm-100-152-ubuntu  cluster-d  150.109.237.90  yes  libreswan     10.89.0.0/16, 10.88.0.0/16  connected  70.173606ms  
vm-32-154-ubuntu   cluster-a  43.128.40.60    yes  libreswan     10.45.0.0/16, 10.44.0.0/16  connected  33.927974ms  
vm-0-33-ubuntu     cluster-c  10.0.0.33       no   libreswan     10.5.0.0/16, 10.4.0.0/16    connected  348.441µs    
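
To check this kind of output without reading it by eye, the table can be reduced to one line per remote cluster. This is a minimal sketch, assuming the column layout shown above (cluster in the second column, NAT in the fourth, status second to last) and gateway names without spaces; `summarize_connections` is a hypothetical helper, not a subctl feature.

```shell
# Read "subctl show connections" output on stdin and print
# "<cluster> nat=<yes|no> status=<status>" for each row.
# Exit status is non-zero if any connection is not "connected".
summarize_connections() {
  awk '
    $1 == "GATEWAY" { in_table = 1; next }   # header row marks the table start
    in_table && NF >= 7 {
      status = $(NF - 1)                      # RTT is the last field, status precedes it
      printf "%s nat=%s status=%s\n", $2, $4, status
      if (status != "connected") bad = 1
    }
    END { exit bad + 0 }
  '
}
```

Piping the output above through it (`subctl show connections | summarize_connections`) would print `cluster-d nat=yes status=connected`, `cluster-a nat=yes status=connected` and `cluster-c nat=no status=connected`, which makes the NAT=no link between the two clusters behind the shared gateway easy to spot.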

mangelajo commented 3 years ago

OK, it seems to be working with 0.9.1. @YAXILALIANGFEIFAN, please let's re-open this if we find any issue.

YAXILALIANGFEIFAN commented 3 years ago

I have double-checked your solution using subctl v0.9.1 together with port mapping, and the case works well. If I use subctl v0.9.1 alone, without setting up port mapping, the case still doesn't work. I think the problem has been solved, thanks a lot!