loft-sh / vcluster

vCluster - Create fully functional virtual Kubernetes clusters - Each vcluster runs inside a namespace of the underlying k8s cluster. It's cheaper than creating separate full-blown clusters and it offers better multi-tenancy and isolation than regular namespaces.
https://www.vcluster.com
Apache License 2.0

Cannot create vcluster inside either k3d or kind on Linux #264

Closed pgagarinov closed 2 years ago

pgagarinov commented 2 years ago

vcluster pod fails with the following error:

time="2021-12-24T17:56:38.949209082Z" level=fatal msg="failed to evacuate root cgroup: mkdir /sys/fs/cgroup/init: read-only file system"

❯ vcluster --version
vcluster version 0.5.0-beta.0

❯ kind --version
kind version 0.11.1

❯ k3d --version
k3d version v5.2.2
k3s version v1.21.7-k3s1 (default)
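
For anyone hitting the same fatal error: the message points at /sys/fs/cgroup being read-only inside the vcluster container. A minimal sketch for confirming that, assuming the usual Linux /proc/mounts field layout (device, mount point, type, options); the helper name cgroup_mount_mode is made up for illustration:

```shell
# Hypothetical helper: print "ro" or "rw" for the /sys/fs/cgroup mount
# found in a mounts file (defaults to /proc/mounts), or "unknown" if
# no such mount is listed.
cgroup_mount_mode() {
  mounts_file="${1:-/proc/mounts}"
  # Field 2 is the mount point, field 4 the comma-separated options.
  awk '$2 == "/sys/fs/cgroup" {
         n = split($4, opts, ",")
         for (i = 1; i <= n; i++)
           if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
       }
       END { print (mode ? mode : "unknown") }' "$mounts_file"
}
```

Calling cgroup_mount_mode with no argument inside the affected container reads /proc/mounts; a result of "ro" would be consistent with the mkdir failure above.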

matskiv commented 2 years ago

@pgagarinov are you running your container runtime in a rootless configuration? Perhaps running vcluster as non-root would help in this case. Here are docs for that - https://www.vcluster.com/docs/operator/restricted-hosts#running-as-non-root-user

pgagarinov commented 2 years ago

@matskiv no, both docker and k3d run as root.

pgagarinov commented 2 years ago

@matskiv After reading https://golangissues.com/issues/931164 I tried to create vcluster using other distributions. Both k0s and k8s work; only the k3s distro fails.

matskiv commented 2 years ago

@pgagarinov Thanks for trying different options; it helped to narrow down the issue. The problem might have been fixed in this k3s PR: https://github.com/k3s-io/k3s/pull/4086. Could you try bumping the k3s image to rancher/k3s:v1.22.2-k3s2, or to the latest stable v1.22, rancher/k3s:v1.22.5-k3s1? You can do so by editing the vcluster StatefulSet or by adding an image override to your values.yaml file, e.g.:

vcluster:
  image: rancher/k3s:v1.22.5-k3s1

and then passing this file to the helm command or to vcluster create your-vcluster-name -f values.yaml, depending on how you install vcluster.
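
Put together, the CLI route might look like the sketch below. It only writes the override file; the create command is left as a comment because it needs a running host cluster, and your-vcluster-name is a placeholder taken from the comment above:

```shell
# Write the image override suggested above into values.yaml
# (in the current directory).
cat > values.yaml <<'EOF'
vcluster:
  image: rancher/k3s:v1.22.5-k3s1
EOF

# Then pass the file to the vcluster CLI (not executed in this sketch):
#   vcluster create your-vcluster-name -f values.yaml
```

If vcluster is installed through helm instead, the same values.yaml would be passed to the helm command.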

pgagarinov commented 2 years ago

@matskiv I tried both images - nothing helps, same error (I checked that the correct image is pulled though).

matskiv commented 2 years ago

@pgagarinov Thank you for testing it. It is certainly disappointing to hear that it was not fixed :/ Are you okay with using k0s or k8s for now? We will try to get back to this issue, but I am not sure where it will land in terms of priorities... A few additional bits of info would help us investigate or reproduce the issue:

- What is your operating system (incl. version)?
- What container runtime (docker/podman/etc.) are you using, and which version?
- Any special configuration of the container runtime or your host k8s?

pgagarinov commented 2 years ago

@matskiv sure, I'll use k0s for now.

I use docker, Manjaro Linux with kernel 5.10, no special configuration.

❯ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
  compose: Docker Compose (Docker Inc., 2.2.2)

Server:
 Containers: 63
  Running: 9
  Paused: 0
  Stopped: 54
 Images: 87
 Server Version: 20.10.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e5ef943eb76627a6d3b6de8cd1ef6537f393a71.m
 runc version: v1.0.3-0-gf46b6ba2
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.84-1-MANJARO
 Operating System: Manjaro Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 31.26GiB
 Name: darkstar
 ID: GOE6:MPR4:B7DF:5I26:DX3L:UMLD:RAN7:II4R:5SJH:4C2A:GWZI:32H3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
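
Since the failure concerns /sys/fs/cgroup, the cgroup lines are probably the most relevant part of the output above (here: systemd driver, cgroup v2). A small sketch for pulling just those lines out of docker info text; cgroup_summary is a hypothetical helper name, and it assumes the human-readable layout shown above:

```shell
# Hypothetical helper: read `docker info` text on stdin and print only
# the cgroup driver and version as "Key=value" lines.
cgroup_summary() {
  awk -F': *' '/^ *Cgroup (Driver|Version):/ {
    gsub(/^ +/, "", $1)   # drop the leading indentation from the key
    print $1 "=" $2
  }'
}
```

Usage would be: docker info | cgroup_summary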

antonioberben commented 2 years ago

+1. My main cluster is k3s (tried multiple versions), and that is where I try to install my vcluster (k3s distro). However, installing vanilla k8s seems to work.

Fail:

sudo curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik,servicelb" INSTALL_K3S_VERSION="v1.22.2-rc1+k3s2" sh -
./vcluster-0.5.0-beta.0 create my-vcluster -n my-vcluster --distro k3s --kubernetes-version v1.20.13

Error:

vcluster time="2021-12-xxxxxxxxx" level=fatal msg="failed to evacuate root cgroup: mkdir /sys/fs/cgroup/init: read-only file system"

Work:

sudo curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable=traefik,servicelb" INSTALL_K3S_VERSION="v1.22.2-rc1+k3s2" sh -
./vcluster-0.5.0-beta.0 create my-vcluster -n my-vcluster --distro k8s --kubernetes-version v1.20.13

Notice that the only difference is the --distro value.

Thank you

EDITED: @pgagarinov, deploying a vcluster (k3s flavor) on top of a main k3s cluster works for me with the v1.21.4+k3s1 version.

FabianKramm commented 2 years ago

@antonioberben does it work with the officially released version v1.22.5+k3s1 instead of the release candidate version v1.22.2-rc1+k3s2?

antonioberben commented 2 years ago

@FabianKramm, it doesn't. I tried several versions; I cannot remember which ones, but I started from the latest one and worked backwards. I tried 3 or 4 versions and then went directly to the one that worked for me: v1.21.4+k3s1

pgagarinov commented 2 years ago

Thanks a lot @antonioberben!

@matskiv I confirm that v1.21.4+k3s1 works, while none of the v1.22.x versions I tried do.
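
Summing up the reports so far as a quick check: k3s v1.21.4+k3s1 worked, and every v1.22.x tag tried in this thread failed. A rough sketch, assuming tags of the form vMAJOR.MINOR.PATCH followed by a suffix; k3s_reported_failing is a hypothetical name, and the result only reflects what was reported here, not an exhaustive compatibility list:

```shell
# Hypothetical helper: print "reported-failing" for k3s tags in the
# v1.22 series (the series that failed in this thread), otherwise
# "not-reported".
k3s_reported_failing() {
  version="${1#v}"        # strip the leading "v", e.g. 1.22.5+k3s1
  major="${version%%.*}"  # text before the first dot
  rest="${version#*.}"    # text after the first dot
  minor="${rest%%.*}"     # text before the next dot
  if [ "$major" = "1" ] && [ "$minor" = "22" ]; then
    echo "reported-failing"
  else
    echo "not-reported"
  fi
}
```

For example, k3s_reported_failing v1.22.5+k3s1 flags the tag, while v1.21.4+k3s1 does not.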

FabianKramm commented 2 years ago

I created an issue in the k3s repo (#4873) to track this.

FabianKramm commented 2 years ago

We'll remove support for k3s v1.22 in the next vcluster version, v0.5.1, until this is fixed or a workaround is available. In the meantime, we recommend using k3s v1.21 (which the vcluster CLI will then use automatically) or the k0s or vanilla k8s v1.22 distros instead.

FabianKramm commented 2 years ago

@antonioberben @pgagarinov this should be fixed with the new v0.6.0-alpha.3 version and the new k3s images. The current stable version v0.5.1 should also work, but it does not yet support the newer k3s images (v1.22 or higher).

FabianKramm commented 2 years ago

I'm closing this; feel free to reopen if it's still an issue.