CentaurusInfra / fornax

Fornax for autonomous and flexible edge computing
Apache License 2.0

User space gateway #43

Closed chenqianfzh closed 2 years ago

chenqianfzh commented 2 years ago

This PR presents a proof-of-concept (PoC) inter-cluster gateway that transmits packets between two mizar clusters.

A new package, gopacket, was imported to get the job done.

As the new package contains thousands of lines of code, for the convenience of the code reviewer, this PR is split into three commits:

  1. import gopacket to vendor folder (reviewer can simply ignore this bulky commit)
  2. updates for the new gopacket package: the code changes needed to incorporate the package into the project and pass CI/CD
  3. user space inter-cluster gateway: the essential part of this work, containing the code for the inter-cluster gateway itself.

Verification

Step 1

Start two mizar clusters. Each cluster contains one host that acts as the inter-cluster gateway and has no bouncer/divider deployed on it. Both clusters share the same VPC, "vpc1". Cluster A is assigned the subnet 192.168.122.#, and cluster B is assigned the subnet 192.168.0.#.
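To make the addressing above concrete, here is a minimal Go sketch that reports which cluster's subnet an address belongs to. The function name `clusterOf` and the labels "A" and "B" are made up for this illustration; it is not taken from the gateway code in this PR.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Subnets as stated in Step 1; the variable names are illustrative only.
var (
	clusterA = netip.MustParsePrefix("192.168.122.0/24")
	clusterB = netip.MustParsePrefix("192.168.0.0/24")
)

// clusterOf reports which cluster's subnet contains addr.
// It returns "unknown" for unparsable or out-of-range addresses.
func clusterOf(addr string) string {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return "unknown"
	}
	switch {
	case clusterA.Contains(ip):
		return "A"
	case clusterB.Contains(ip):
		return "B"
	}
	return "unknown"
}

func main() {
	for _, a := range []string{"192.168.122.10", "192.168.0.7", "10.0.0.1"} {
		fmt.Printf("%s -> cluster %s\n", a, clusterOf(a))
	}
}
```

Because the two /24 ranges do not overlap, an address can match at most one of them.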

The environments I was using are:

Cluster A (192.168.122.0/24):

To start this cluster:

  1. ssh into the machine qian-mizar-dev-gw
  2. make sure the repo /root/mizar is at the branch qian-icgw-122-mizar
  3. go to /root/mizar_cluster_scripts, and run
    ./build_docker_image.sh && ./restart_cluster.sh && ./create_vpc_nets.sh 

Cluster B (192.168.0.0/24):

To start this cluster:

  1. ssh into the machine edge-team-testing-master
  2. make sure the repo /root/mizar is at the branch qian-icgw-0-mizar
  3. go to /root/mizar_cluster_scripts, and run
    ./build_docker_image.sh && ./restart_cluster.sh && ./create_vpc_nets.sh 

Step 2

On the master machine of each cluster, run the following command and note down the IP address of the divider for VPC1 in that cluster:

kubectl get dividers -o wide

Step 3

Deploy inter-cluster-gateway code to the gateway machines.

On each of the two gateway machines, do the following:

a. Install the libpcap-dev package if it is not already installed (the gopacket package will not work without it):

sudo apt-get install libpcap-dev -y

b. Clone the fornax repo and switch to the user-space-gateway branch.

c. Build the icgw binary:

cd cloud/cmd/inter_cluster_gateway/
go build

Step 4

Restart the two gateway machines once restart_cluster.sh has finished and everything has settled.

For reasons not yet understood, the icgw binary runs into errors after an XDP deployment is done. Restarting the machines seems to resolve the problem.

Step 5

Start the user-space gateway.

On the cluster A gateway machine, run:

sudo ./inter_cluster_gateway -remote_icgw=[cluster_B_gateway_IP]  -local_divider=[Cluster_A_VPC1_divider_IP]

On the cluster B gateway machine, run:

sudo ./inter_cluster_gateway -remote_icgw=[cluster_A_gateway_IP]  -local_divider=[Cluster_B_VPC1_divider_IP]
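The two commands above differ only in the addresses passed to `-remote_icgw` and `-local_divider`. As a rough sketch of how such flags could be parsed and validated with Go's standard `flag` package: the flag names come from the commands above, while the `config` type, the `parseArgs` helper, the help strings, and the sample addresses in `main` are assumptions made for this illustration and may not match the code in this PR.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
)

// config holds the two addresses passed on the command line (hypothetical type).
type config struct {
	remoteICGW   net.IP
	localDivider net.IP
}

// parseArgs parses -remote_icgw and -local_divider and checks that
// both values are valid IP addresses.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("inter_cluster_gateway", flag.ContinueOnError)
	remote := fs.String("remote_icgw", "", "IP of the gateway in the other cluster")
	divider := fs.String("local_divider", "", "IP of the VPC1 divider in this cluster")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	cfg := config{
		remoteICGW:   net.ParseIP(*remote),
		localDivider: net.ParseIP(*divider),
	}
	if cfg.remoteICGW == nil {
		return config{}, fmt.Errorf("invalid -remote_icgw %q", *remote)
	}
	if cfg.localDivider == nil {
		return config{}, fmt.Errorf("invalid -local_divider %q", *divider)
	}
	return cfg, nil
}

func main() {
	args := os.Args[1:]
	if len(args) == 0 {
		// Placeholder documentation addresses, used only when no flags are given.
		args = []string{"-remote_icgw=192.0.2.1", "-local_divider=192.0.2.2"}
	}
	cfg, err := parseArgs(args)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("remote gateway %s, local divider %s\n", cfg.remoteICGW, cfg.localDivider)
}
```

Rejecting malformed addresses at startup means a typo in either flag fails immediately instead of showing up later as a ping that never gets a reply.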

Step 6

Start one pod in each of the two clusters.

On the cluster A master machine, run:

kubectl create -f /root/mizar_cluster_scripts/vanilla-net2.yaml

On the cluster B master machine, run:

kubectl create -f /root/mizar_cluster_scripts/vanilla-net1.yaml

Find the IP address of pod-in-net1 in cluster B with the command "kubectl get pod pod-in-net1 -o wide" and note it down.

Step 7

From cluster A, ping the pod in cluster B. On the cluster A master machine, run:

kubectl exec -it pod-in-net2 -- /bin/ping [pod-in-net1_IP-in-cluster-B]

The command output shows that the ping succeeds.

Step 8

On the cluster B master machine, delete the pod by running:

kubectl delete -f /root/mizar_cluster_scripts/vanilla-net1.yaml

Step 9

From cluster A, ping the deleted pod's address in cluster B again. On the cluster A master machine, run:

kubectl exec -it pod-in-net2 -- /bin/ping [pod-in-net1_IP-in-cluster-B]

The command output shows that the ping hangs, as expected now that the target pod has been deleted.

Appendix

A short video shows how it works:

https://github.com/pdgetrf/ArktosEdge/blob/main/slides/success_portal_ping.mp4

pdgetrf commented 2 years ago

@jshaofuturewei please review this PR too, for knowledge-transfer purposes.