loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

vcluster and virtual kubelet do not work together #1944

Open antoinetran opened 1 month ago

antoinetran commented 1 month ago

What happened?

When deploying kind, then vcluster, then interLink (which deploys a virtual kubelet), the virtual node created by interLink is deleted by vcluster.

What did you expect to happen?

The virtual node should appear in the output of `kubectl get node`.

How can we reproduce it (as minimally and precisely as possible)?
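The full manifests are in the linked interLink issue below; roughly, the setup order is as follows (a sketch of the commands, not a verified script):

```console
$ kind create cluster
$ vcluster create my-vcluster
# deploy interLink inside the vcluster (see the linked issue for manifests);
# it registers a virtual kubelet node, e.g. my-vk-node
$ kubectl get node   # the virtual kubelet node disappears shortly after
```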

Anything else we need to know?

See https://github.com/interTwin-eu/interLink/issues/260

This error pattern appears in vcluster logs:

```
delete virtual node my-vk-node, because it is not needed anymore
```

This means vcluster deletes it. The relevant vcluster code is at https://github.com/loft-sh/vcluster/blob/v0.19.4/pkg/controllers/resources/nodes/syncer.go#L271, in SyncToHost. I think vcluster treats the virtual kubelet's node as one of its own virtual nodes and deletes it because it does not recognize it.
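To make the suspected behavior concrete, here is a minimal standalone Go sketch of that decision (not the actual vcluster code; the physical node name kind-control-plane is an assumption for a kind setup):

```go
package main

import "fmt"

// physicalNodes simulates the set of node names known to the host cluster.
// In vcluster, SyncToHost looks up the physical node corresponding to a
// virtual node; this sketch only mimics that decision.
var physicalNodes = map[string]bool{
	"kind-control-plane": true,
}

// shouldDeleteVirtualNode mirrors the observed behavior: a virtual node
// with no physical counterpart is considered stale and removed.
func shouldDeleteVirtualNode(name string) bool {
	return !physicalNodes[name]
}

func main() {
	for _, node := range []string{"kind-control-plane", "my-vk-node"} {
		if shouldDeleteVirtualNode(node) {
			fmt.Printf("delete virtual node %s, because it is not needed anymore\n", node)
		}
	}
}
```

Running this prints the delete line only for my-vk-node, the node with no physical twin, which matches the log above.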

Host cluster Kubernetes version

```console
$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.0
```

vcluster version

0.18.1 and 0.19.4

VCluster Config

```
# My vcluster.yaml / values.yaml here
```
antoinetran commented 1 month ago

Okay, after digging into the code, I think this is what happens. When the virtual kubelet creates a virtual node through the vcluster API, the virtual node is created in the vcluster persistence (etcd). Then the vcluster reconcile function is called on the node creation event. It checks whether the virtual node (in the vcluster etcd) matches a physical node (known in the host Kubernetes etcd) here. Since there is no physical node related to the virtual kubelet, vcluster deletes it, calling here, thus the log:

```
delete virtual node %s, because it is not needed anymore
```

A second problem, not happening for now, is that vcluster does not keep a node if it does not contain any pods (see code here), which might be the case for the virtual kubelet node in the beginning.
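A minimal sketch of this second check in isolation (hypothetical, not vcluster's code; the "sync only nodes that run pods" setting is modeled as a plain boolean):

```go
package main

import "fmt"

// nodeStillNeeded sketches the pod-count check: with "sync only nodes that
// run pods" enabled, a node with zero pods is dropped even if it has a
// physical counterpart.
func nodeStillNeeded(podsOnNode int, syncOnlyNodesWithPods bool) bool {
	return !syncOnlyNodesWithPods || podsOnNode > 0
}

func main() {
	// A freshly registered virtual kubelet node typically runs no pods yet.
	fmt.Println(nodeStillNeeded(0, true)) // false: the empty node would be removed
	fmt.Println(nodeStillNeeded(2, true)) // true: the node runs pods, so it is kept
}
```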

My recommendation: when a virtual node is created through the vcluster API, but NOT through the host Kubernetes API, this node should be marked as not being managed by vcluster, exempting it both from the test of whether a counterpart physical node exists and from the test of whether the virtual node contains any pods. Its lifecycle should not be managed by vcluster.
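A rough Go sketch of what this could look like, assuming a marker annotation on the node; the annotation key and all names here are made up for illustration:

```go
package main

import "fmt"

// Hypothetical annotation marking a node as created through the vcluster API
// only (e.g. by a virtual kubelet); the key is invented for this sketch.
const externalNodeAnnotation = "example.com/unmanaged-virtual-node"

// node is a minimal stand-in for corev1.Node.
type node struct {
	name        string
	annotations map[string]string
}

// managedByVcluster sketches the proposed rule: skip both lifecycle checks
// (physical counterpart, pod count) for nodes carrying the marker.
func managedByVcluster(n node) bool {
	return n.annotations[externalNodeAnnotation] != "true"
}

func main() {
	vk := node{
		name:        "my-vk-node",
		annotations: map[string]string{externalNodeAnnotation: "true"},
	}
	if !managedByVcluster(vk) {
		fmt.Printf("skipping lifecycle management for %s\n", vk.name)
	}
}
```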

antoinetran commented 1 month ago

Another possible fix, though a bit uglier, is to add a vcluster Helm configuration option that tells vcluster not to manage any virtual node whose name matches a regex pattern given in the vcluster Helm values.yaml, as sketched below.
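A hypothetical values.yaml sketch of such an option; neither the key name nor the feature exists in vcluster today:

```yaml
# Hypothetical values.yaml addition -- the key name is made up to
# illustrate the idea, it is not a real vcluster option.
sync:
  nodes:
    # Virtual nodes whose names match this regex would be left alone by the
    # node syncer (no deletion, no lifecycle management).
    unmanagedNodeNameRegex: "^my-vk-.*$"
```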