loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0
6.16k stars, 372 forks

Respect underlying node taints #105

Closed: Vertiwell closed 2 years ago

Vertiwell commented 2 years ago

Hi, I'm not sure if I'm missing a setting (I'm newish to k3s), but nodes that I start with --node-taint CriticalAddonsOnly=true:NoExecute in the underlying K3s cluster still receive pods from vCluster, and those pods then fail to start.

FabianKramm commented 2 years ago

@Vertiwell thanks for creating this issue! If you want a taint that you set on a node within the vcluster to also take effect on the host cluster (vcluster has no scheduler of its own), you will need to enable --sync-node-changes via:

values.yaml:

rbac:
  clusterRole:
    create: true
syncer:
  extraArgs:
  - --sync-all-nodes
  - --sync-node-changes
  - --fake-nodes=false

Then create the vcluster with:

vcluster create test -n test -f values.yaml

If you now taint a node within the vcluster, no pods should get scheduled there anymore.
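
As a rough sketch of that last step, the taint could be applied from a shell whose kubeconfig points at the vcluster. The node name worker-1 and the taint dedicated=infra:NoSchedule are placeholders, not values from this thread, and the KUBECTL variable and both helper functions are hypothetical: by default the script only prints the kubectl commands, and it runs them only if you set KUBECTL=kubectl.

```shell
#!/bin/sh
# Dry-run by default: prints the kubectl commands instead of running them.
# Set KUBECTL=kubectl to execute them against the vcluster.
KUBECTL="${KUBECTL:-echo kubectl}"

# Hypothetical helper: taint a node from inside the vcluster.
# $1 = node name, $2 = taint in key=value:effect form.
taint_node() {
  $KUBECTL taint nodes "$1" "$2"
}

# Hypothetical helper: remove the same taint again (note the trailing "-").
untaint_node() {
  $KUBECTL taint nodes "$1" "${2}-"
}

# Placeholder node name and taint; replace with your own.
taint_node worker-1 dedicated=infra:NoSchedule
```

With --sync-node-changes enabled as described above, the expectation is that the taint is propagated to the matching host node, where the host scheduler honours it.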

Vertiwell commented 2 years ago

@FabianKramm thanks for the reply, I appreciate the assistance. I started up a new vcluster with the above settings and the new k3s started on a worker node successfully; however, coredns was then started on a control node tainted with --node-taint CriticalAddonsOnly=true:NoExecute.

Coredns then sits idle, constantly logging errors:

E0803 07:42:33.296251 1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch v1.Namespace: failed to list v1.Namespace: Get "https://10.43.2.35:443/api/v1/namespaces?limit=500&resourceVersion=0": dial tcp 10.43.2.35:443: connect: no route to host

E0803 07:42:43.066595 1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch v1.Service: failed to list v1.Service: Get "https://10.43.2.35:443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.43.2.35:443: connect: no route to host

E0803 07:42:44.730891 1 reflector.go:127] pkg/mod/k8s.io/client-go@v0.19.2/tools/cache/reflector.go:156: Failed to watch v1.Endpoints: failed to list v1.Endpoints: Get "https://10.43.2.35:443/api/v1/endpoints?limit=500&resourceVersion=0": dial tcp 10.43.2.35:443: connect: no route to host

[INFO] plugin/ready: Still waiting on: "kubernetes"

Commands run:

VCNAME=vcluster-1
DOMAIN=example.com
mkdir -p /var/vclusters/$VCNAME
kubectl create ns $VCNAME
printf "apiVersion: traefik.containo.us/v1alpha1\nkind: IngressRouteTCP\nmetadata:\n namespace: "$VCNAME"\n name: "$VCNAME"\nspec:\n entryPoints:\n - websecure\n routes:\n - match: HostSNI(`"$VCNAME"."$DOMAIN"`)\n services:\n - name: $VCNAME\n port: 443\n tls:\n passthrough: true\n" > /var/vclusters/$VCNAME/$VCNAME-ingressroutetcp.yaml
kubectl apply -f /var/vclusters/$VCNAME/$VCNAME-ingressroutetcp.yaml
printf 'rbac:\n clusterRole:\n create: true\nvcluster:\n image: rancher/k3s:v1.20.6-k3s1\n extraArgs:\n - --disable=servicelb\nsyncer:\n extraArgs:\n - --sync-all-nodes\n - --sync-node-changes\n - --fake-nodes=false\n - --disable-sync-resources=ingresses\n - --enable-storage-classes\n - --out-kube-config-server="https://'"$VCNAME"'.'"$DOMAIN"'"\n - --tls-san='"$VCNAME"'.'"$DOMAIN"'\n' > /var/vclusters/$VCNAME/vcluster.yaml
helm upgrade vcluster-1 vcluster --install --create-namespace --repository-config='' --repo https://charts.loft.sh --namespace vcluster-1 --values /var/vclusters/vcluster-1/vcluster.yaml

FabianKramm commented 2 years ago

@Vertiwell thanks for the information! Sorry, I'm not sure I understand your problem correctly. vcluster does not apply any restrictions to the pods it creates in the host cluster. We currently support --node-selector label1=value1, which tells vcluster to use this node selector for all new pods. In general, however, we recommend using an admission controller such as OPA, Kyverno or jsPolicy to enforce any specific restrictions on the created pods, as this is much more flexible.
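
For illustration, the node-selector option mentioned above might be passed through the same syncer.extraArgs mechanism used earlier in this thread. This is only a sketch: the file name node-selector-values.yaml is made up, label1=value1 is the placeholder from the comment, and writing the flag in --flag=value form inside extraArgs is an assumption rather than something confirmed here.

```shell
#!/bin/sh
# Write a values file that asks the syncer to add a node selector to new pods.
# The file name is hypothetical; label1=value1 is a placeholder label.
cat > node-selector-values.yaml <<'EOF'
syncer:
  extraArgs:
  - --node-selector=label1=value1
EOF

# Then create the vcluster as before (not run here):
#   vcluster create test -n test -f node-selector-values.yaml
```

For the selector to match anything, the host nodes meant to run vcluster workloads would need that label, for example via kubectl label nodes <node-name> label1=value1.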

Vertiwell commented 2 years ago

@FabianKramm thanks for the info. I'm starting to lean towards this being an issue with Calico on the underlying cluster rather than with the node taint; I'll do some more testing. Never mind, it's not an issue with Calico or the node taint: even without either of them, on a fresh install of K3s with MetalLB and Traefik v2, coredns stays broken forever after installing vcluster.

Vertiwell commented 2 years ago

The issue was with the Traefik v2 ingress routes. As shown above, I was allowing traffic via an IngressRouteTCP, but not allowing HTTP traffic via an IngressRoute. Once I added an IngressRoute, everything started working. Cheers.