loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

vcluster crashloop #45

Closed (AndreasDeCrinis closed this issue 3 years ago)

AndreasDeCrinis commented 3 years ago

Hi,

I installed Loft today on our on-prem upstream Kubernetes 1.19.10 cluster without any problems. Afterwards, I tried to create a vcluster, which crashes constantly.

I see lots of timeouts in the logs, yet the apiservices seem to be up and running:

v1.admissionregistration.k8s.io failed with : Timeout: request did not complete within requested timeout 34s
v1beta1.apiextensions.k8s.io failed with : Timeout: request did not complete within requested timeout 34s
v1.apiextensions.k8s.io failed with : context deadline exceeded
v1beta1.admissionregistration.k8s.io failed with : context deadline exceeded

loft.log

Help would be highly appreciated.

FabianKramm commented 3 years ago

@AndreasKappel thanks for creating this issue! Which storage provider do you use in your Kubernetes cluster? We have seen errors like this in clusters where the persistent storage was very slow.

FabianKramm commented 3 years ago

It could be worth testing vcluster in non-persistence mode to see whether the same error occurs. You can do this by creating a values.yaml with:

storage:
  persistence: false

Then create the vcluster with the following command (make sure you use vcluster CLI version v0.3.0-beta.0 or above):

vcluster create test -n test -f values.yaml

If that works without any problems, the storage provider is most likely the issue.

AndreasDeCrinis commented 3 years ago

Disabling persistence did the trick, but why? Storage provisioning is working fine for the rest of the projects on the cluster.

FabianKramm commented 3 years ago

@AndreasKappel thanks for the answer! k3s uses SQLite by default, and it reads from and writes to the SQLite database very frequently, so it needs fairly fast persistent storage; otherwise requests hang and time out. You can also try a MySQL, PostgreSQL or etcd datastore backend instead of SQLite. We have a guide in the docs for this: https://www.vcluster.com/docs/operator/external-datastore
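
To get a rough feel for whether a volume is fast enough for this kind of workload, one option is to time small committed SQLite writes on it. The script below is only a minimal sketch and not part of vcluster or k3s: the function name `probe_commit_latency`, the file name `latency_probe.db`, the 1 KiB payload and the default of 200 commits are all illustrative assumptions, and it does not reproduce the actual k3s access pattern. It would need to run somewhere the volume in question is mounted, for example in a pod that uses the same storage class.

```python
import sqlite3
import statistics
import tempfile
import time
from pathlib import Path


def probe_commit_latency(directory, commits=200):
    """Return per-commit latencies (ms) for a throwaway SQLite file in `directory`.

    Hypothetical helper for illustration only; not part of vcluster or k3s.
    """
    db_path = Path(directory) / "latency_probe.db"  # assumed scratch file name
    conn = sqlite3.connect(str(db_path))
    latencies = []
    try:
        # FULL asks SQLite to sync to disk on every commit, so slow storage
        # shows up directly in the commit time.
        conn.execute("PRAGMA synchronous = FULL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY, payload BLOB)"
        )
        conn.commit()
        payload = b"x" * 1024  # small write, size chosen arbitrarily
        for _ in range(commits):
            start = time.perf_counter()
            conn.execute("INSERT INTO probe (payload) VALUES (?)", (payload,))
            conn.commit()
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        conn.close()
        db_path.unlink(missing_ok=True)  # remove the scratch database
    return latencies


if __name__ == "__main__":
    # Replace the temporary directory with a path on the volume under test.
    with tempfile.TemporaryDirectory() as tmp:
        samples = probe_commit_latency(tmp)
        print(f"median: {statistics.median(samples):.2f} ms, "
              f"max: {max(samples):.2f} ms over {len(samples)} commits")
```

Comparing the numbers from the slow storage class against a known-fast one (or a local disk) gives a relative indication only; there is no fixed threshold implied here.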

AndreasDeCrinis commented 3 years ago

Ah, OK. Then I will try our all-flash storage class with super low latency. Thanks for the hint about the external datastore; this might be an even better option, as it can run in HA mode. I will close the issue.