networkop / meshnet-cni

a (K8s) CNI plugin to create arbitrary virtual network topologies
BSD 3-Clause "New" or "Revised" License

Add reconciliation for grpc-wires #57

Open · alexmasi opened 1 year ago

alexmasi commented 1 year ago

When a grpc-wire-enabled meshnet pod on a node restarts (due to OOM, error, etc.), the grpc-wire info (wire/handler maps) is not persisted or reconciled on restart.

https://github.com/networkop/meshnet-cni/blob/d3ae64833f4710c0ba1810737df0bd5de414844f/daemon/grpcwire/grpcwire.go#L143

This leads to errors like the following:

SendToOnce (wire id - 77): Could not find local handle. err:interface 77 is not active

stemming from:

https://github.com/networkop/meshnet-cni/blob/d3ae64833f4710c0ba1810737df0bd5de414844f/daemon/grpcwire/grpcwire.go#L254

To make the grpc-wire add-on more resilient, reconciliation should be added (likely using the Topology CRD).
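
A rough sketch of what startup reconciliation could look like is below: list the Topology custom resources and rebuild the in-memory wire state from them instead of starting with empty maps. The GVR, field layout, and function names here are assumptions for illustration, not the actual meshnet-cni implementation.

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// topologyGVR assumes the group/version used by the meshnet Topology CRD manifests.
var topologyGVR = schema.GroupVersionResource{
	Group:    "networkop.co.uk",
	Version:  "v1beta1",
	Resource: "topologies",
}

// reconcileWires lists all Topology CRs and would rebuild the daemon's
// wire/handler maps from them on startup (hypothetical helper).
func reconcileWires(ctx context.Context) error {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return fmt.Errorf("building in-cluster config: %w", err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		return fmt.Errorf("creating dynamic client: %w", err)
	}
	topos, err := client.Resource(topologyGVR).Namespace(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		return fmt.Errorf("listing topologies: %w", err)
	}
	for _, t := range topos.Items {
		// A real reconciler would rebuild the wire/handler maps here from each
		// Topology's spec/status instead of starting with empty maps.
		fmt.Printf("would rebuild grpc-wire state for %s/%s\n", t.GetNamespace(), t.GetName())
	}
	return nil
}

func main() {
	if err := reconcileWires(context.Background()); err != nil {
		fmt.Println(err)
	}
}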

alexmasi commented 1 year ago

https://github.com/networkop/meshnet-cni/pull/80

alexmasi commented 1 year ago

Testing the grpc-wire reconciliation using a 150-node topology on KNE across 2 workers. Getting a few issues:

@kingshukdev

kingshukdev commented 1 year ago

@alexmasi It looks like old CRD yamls have been used to deploy meshnet with a newer meshnet binary, so the CRD definition in K8s and the CRD the binary supports are not in sync. Please take the latest yaml (manifest folder) from the master branch for deployment and let us know if it solves the issue.

alexmasi commented 1 year ago

Thanks Kingshuk, that's a mistake on my end. The grpc-wire reconciliation appears to be working now. When I delete a meshnet pod, it reloads with full information about the already created links. I appreciate the implementation!

However, there is a separate issue with meshnet reconciliation in general. I tried deleting/recreating a meshnet pod during topology creation and it mostly works, except that some of the pods (in this case, the init containers for several of the router pods) get stuck waiting:

$ kubectl get pods -A -o wide | grep Init
ceos-150                         r111                                                          0/1     Init:0/1   0          37m   10.244.2.249   alexmasi-worker-2     <none>           <none>
ceos-150                         r112                                                          0/1     Init:0/1   0          42m   10.244.1.217   alexmasi-worker-1     <none>           <none>
ceos-150                         r113                                                          0/1     Init:0/1   0          41m   10.244.1.233   alexmasi-worker-1     <none>           <none>
ceos-150                         r124                                                          0/1     Init:0/1   0          43m   10.244.1.213   alexmasi-worker-1     <none>           <none>
ceos-150                         r125                                                          0/1     Init:0/1   0          42m   10.244.1.218   alexmasi-worker-1     <none>           <none>
ceos-150                         r126                                                          0/1     Init:0/1   0          41m   10.244.1.229   alexmasi-worker-1     <none>           <none>
ceos-150                         r5                                                            0/1     Init:0/1   0          44m   10.244.1.205   alexmasi-worker-1     <none>           <none>
ceos-150                         r6                                                            0/1     Init:0/1   0          42m   10.244.1.216   alexmasi-worker-1     <none>           <none>
ceos-150                         r7                                                            0/1     Init:0/1   0          41m   10.244.1.235   alexmasi-worker-1     <none>           <none>

$ kubectl logs r5 -n ceos-150 init-r5 | tail -1
Connected 2 interfaces out of 3
$ kubectl logs r6 -n ceos-150 init-r6 | tail -1
Connected 1 interfaces out of 3
$ kubectl logs r7 -n ceos-150 init-r7 | tail -1
Connected 2 interfaces out of 3

Note that all but one of these cases happen on the worker node where the meshnet pod was deleted mid topology creation. Did you come across this issue in your testing @kingshukdev ?

kingshukdev commented 1 year ago

@alexmasi glad to know that recon worked.

I can think of a few tricky situations if the meshnet daemon is restarted during topology creation. It is very time sensitive: the meshnet daemon is not available while K8s is trying to create the next pod. If the meshnet daemon comes back up before K8s retries, then it will go through.

How are you restarting the meshnet daemon - is it "kill -9 pid"? Once we know how you are restarting it, we can try playing with that.
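
To illustrate the window, here is a minimal sketch (not the actual plugin code) of a CNI-side helper that retries the local meshnet daemon for a bounded time before giving up; the daemon address, port, and timeout are assumptions:

package main

import (
	"fmt"
	"net"
	"time"
)

// waitForDaemon retries the local meshnet daemon until it answers or the
// deadline passes, roughly the window kubelet gives a CNI ADD before failing.
func waitForDaemon(addr string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err == nil {
			conn.Close()
			return nil // daemon is back up; pod creation can proceed
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("meshnet daemon at %s not reachable within %s", addr, timeout)
}

func main() {
	// Port 51111 and the 30s budget are assumptions used only for illustration.
	if err := waitForDaemon("localhost:51111", 30*time.Second); err != nil {
		fmt.Println(err)
	}
}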

alexmasi commented 1 year ago

kubectl delete pod meshnet-****** -n meshnet

then k8s will automatically bring up a new pod to match the intent

Cerebus commented 6 months ago

There seems to be a bug in #80. In my single-node cluster, I get:

time="2024-04-05T11:57:18-05:00" level=error msg="failed to run meshnet cni: <nil>"
time="2024-04-05T11:57:58-05:00" level=error msg="Add[c]: Failed to set a skipped flag on peer a"

This happens for all pods after the first two or three.

> k get pods
NAME   READY   STATUS     RESTARTS   AGE
a      0/2     Init:0/1   0          24s
b      0/2     Init:0/1   0          24s
c      0/2     Init:0/1   0          24s
d      0/2     Init:0/1   0          24s
aa     2/2     Running    0          23s
dd     2/2     Running    0          23s

(Pods stick in Init because I have an initContainer that waits on all the interfaces to be added. Since the CNI client is failing, this initContainer never exits.)
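
For context, a minimal sketch of what such an init container could do, assuming it simply polls until the expected number of non-loopback interfaces shows up; the real init image and its exact check may differ:

package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
	"time"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: wait-for-intfs <expected-count>")
		os.Exit(1)
	}
	want, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "expected-count must be an integer")
		os.Exit(1)
	}
	for {
		ifaces, err := net.Interfaces()
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		got := 0
		for _, i := range ifaces {
			if i.Flags&net.FlagLoopback == 0 {
				got++ // count every non-loopback interface, including eth0
			}
		}
		fmt.Printf("Connected %d interfaces out of %d\n", got, want)
		if got >= want {
			return // all expected links are present; let the main containers start
		}
		time.Sleep(time.Second)
	}
}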

ETA: In this deployment, Pods a/b/c/d are linked in a "diamond" network (peers are a-b, b-d, a-c, c-d, and b-c), and Pods aa/dd are linked to only one peer. So I speculate that this has something to do with Pods with multiple peers. More testing needed.