networkop / meshnet-cni

a (K8s) CNI plugin to create arbitrary virtual network topologies
BSD 3-Clause "New" or "Revised" License

Meshnet fails to set up when the node-internal network is not on the default route #36

Closed: Cerebus closed this issue 2 years ago

Cerebus commented 2 years ago

Conditions: A k8s cluster where nodes are on a private network routed over a different interface than the default route.

Expectation: getVxLanSource() should discover the interface and IP that are guaranteed to be on the node-internal network.

Actual: The IP for the interface carrying the default route is used instead.

Culprit code is here: https://github.com/networkop/meshnet-cni/blob/4bf3db70fb109e360797e6b05faa1c51ff053677/plugin/meshnet.go#L81
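
For context, a minimal, self-contained illustration (not the repo's actual code) of why default-route-based discovery picks the wrong address in this setup: asking the kernel how it would reach an arbitrary external destination yields the source IP of the default-route interface, i.e. eth0's NAT address rather than the host-only eth1 address.

```go
package main

import (
	"fmt"
	"net"
)

// defaultRouteSourceIP returns the source address the kernel would use for an
// arbitrary external destination, i.e. the address on the default-route
// interface. No packets are actually sent for a UDP "dial".
func defaultRouteSourceIP() (net.IP, error) {
	conn, err := net.Dial("udp4", "8.8.8.8:53")
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	return conn.LocalAddr().(*net.UDPAddr).IP, nil
}

func main() {
	ip, err := defaultRouteSourceIP()
	if err != nil {
		panic(err)
	}
	// On the minikube/VirtualBox setup described below, this prints eth0's
	// NAT address, not the host-only eth1 address the vxlan needs.
	fmt.Println(ip)
}
```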

Reproduction: I found this trying to spin up meshnet on a multi-node minikube cluster. The VBox driver on macOS creates a host-only network for the cluster nodes and attaches it to eth1, using eth0 for NAT networking (which also gets the default route). Further, all the nodes in the cluster get the same IP on eth0. As a result, each pod gets the same srcIP, fails the peer pod IsAlive check, and skips the peer. Lather, rinse, repeat.

Even if the nodes had distinct addresses, nodes are not reachable over the NAT network, so the vxlan link would fail to carry traffic (though setup might complete).

Suggested fix: IMHO, the best answer would be to get the node's srcIP from the Node resource's status.addresses[] InternalIP entries. At least one InternalIP address must be present, so when multiple are present just take the first one. The convention is that all addresses of InternalIP type are reachable by all nodes in the cluster, but this isn't guaranteed. However, where that isn't the case (e.g., a large multi-tenant cluster with separate worker pools), I would also expect taints and affinities to be in use, so all the nodes with meshnet Pods scheduled will be in the same pool and thus able to talk to each other.

References: https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/node-v1/#NodeStatus
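
A rough sketch of what that suggestion could look like with client-go, assuming the daemon runs with a service account that can read Nodes and learns its own node name via the Downward API; NODE_NAME, the package, and the function name here are illustrative, not existing meshnet code:

```go
package nodeip

import (
	"context"
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// nodeInternalIP looks up this node's Node object and returns the first
// InternalIP entry from status.addresses.
func nodeInternalIP(ctx context.Context) (string, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return "", err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return "", err
	}
	node, err := clientset.CoreV1().Nodes().Get(ctx, os.Getenv("NODE_NAME"), metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// By convention, InternalIP addresses are reachable from every node in
	// the cluster; take the first one when several are present.
	for _, addr := range node.Status.Addresses {
		if addr.Type == corev1.NodeInternalIP {
			return addr.Address, nil
		}
	}
	return "", fmt.Errorf("node %s has no InternalIP address", node.Name)
}
```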

networkop commented 2 years ago

yep, I agree, totally. This was a quick and dirty hack. do you wanna do a PR?

Cerebus commented 2 years ago

I was just poking around to give it a go (hah!) but my go-fu is weak. You'd probably be an order of magnitude faster if you have the time. Let's see who gets there first. :)

networkop commented 2 years ago

I think the easiest way would involve minimal Go code. We can pass this value as an env var, since it doesn't change that often anyway. We'd only need two changes:

Cerebus commented 2 years ago

I was just thinking that Pod.status.hostIP was the better way to go. Runtime injection and a lookup would work. I probably won't get to this until next week.
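
For reference, a sketch of the lookup variant, reusing the in-cluster clientset from the earlier sketch and assuming POD_NAME/POD_NAMESPACE are injected via the Downward API (names are illustrative):

```go
// podHostIP reads this pod's status.hostIP through the API. POD_NAME and
// POD_NAMESPACE are assumed to be injected via the Downward API.
func podHostIP(ctx context.Context, clientset kubernetes.Interface) (string, error) {
	pod, err := clientset.CoreV1().Pods(os.Getenv("POD_NAMESPACE")).
		Get(ctx, os.Getenv("POD_NAME"), metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return pod.Status.HostIP, nil
}
```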

Cerebus commented 2 years ago

I got partway through this and realized it's not going to work. The plugin handles the vxlan creation and linking, but it's invoked by kubelet, not by the daemon, so an env var projected into meshnetd's container won't be visible to the plugin.

As near as I can tell, there are two options left for discovering the proper iface from the plugin alone, and neither is good:

The problem is there's no consistency between k8s distros for either of these; e.g., k3s is a monolithic binary (no separate kubelet process), and not all installers use /etc/kubernetes/ to store the kubeconfig. So I can't think of a way to implement something reliable.

The alternatives:

I'm willing to take a stab at the latter, but I'm really stretching my knowledge here. But ya learn something new every day.

networkop commented 2 years ago

yep, in fact, you could very well set the src_ip in meshnetd somewhere after https://github.com/networkop/meshnet-cni/blob/4bf3db70fb109e360797e6b05faa1c51ff053677/daemon/meshnet/handler.go#L59 and return it to the CNI plugin. That way we could avoid doing https://github.com/networkop/meshnet-cni/blob/4bf3db70fb109e360797e6b05faa1c51ff053677/plugin/meshnet.go#L169 completely.
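
A rough shape of that daemon-side approach, with placeholder names throughout (Daemon, srcIP, SrcIp and the constructor are assumptions, not meshnetd's actual types or proto fields):

```go
package daemon

import "context"

// Daemon caches the node's InternalIP once at startup so that every gRPC
// response to the CNI plugin can carry it.
type Daemon struct {
	srcIP string
}

// New resolves the source IP up front, e.g. with the Node-status lookup
// sketched earlier in this thread.
func New(ctx context.Context, resolve func(context.Context) (string, error)) (*Daemon, error) {
	ip, err := resolve(ctx)
	if err != nil {
		return nil, err
	}
	return &Daemon{srcIP: ip}, nil
}

// In the Get handler, the cached value would then be copied into the response
// (e.g. resp.SrcIp = d.srcIP), so the plugin can drop its own source-IP
// discovery entirely.
```

Resolving the address once at startup also matches the earlier observation that this value rarely changes.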