kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

WSL environment needs additional cgroupv2 configuration #3685

Closed tppalani closed 2 months ago

tppalani commented 3 months ago

What happened:

I created a kind cluster from a kindest/node base image using config.yaml. The control plane and node are in Ready state, but the application pods report the error shown in the logs below. I don't see this error with the older release kindest/node:v1.27.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

logs

combined from similar events): Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "d755ebc8e8c9df8392a0819a9a9873f8b3dc92bc8adc2e0cb78a373effb85c2d": OCI runtime exec failed: exec failed: unable to start container process: error adding pid 655569 to cgroups: failed to write 655569: openat2 /sys/fs/cgroup/unified/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod9aca007f_efe0_4ef4_98f7_17751468c3e5.slice/cri-containerd-1872cb535b10d6ae6b00f2e0891

Environment:


- OS (e.g. from `/etc/os-release`): windows 11
- Kubernetes version: (use `kubectl version`):
  ```
  Client Version: v1.28.2
  Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
  Server Version: v1.30.0
  ```
- Any proxies or other special environment settings?: NA
aojea commented 3 months ago

Please use the latest stable version of kind and report back. Also, it seems you are using cgroup v1, which has known issues:

https://github.com/kubernetes-sigs/kind/issues/3558#issuecomment-2040823712

tppalani commented 3 months ago

May I know what is the stable version?

aojea commented 3 months ago

> May I know what is the stable version?

The latest one listed at https://github.com/kubernetes-sigs/kind/releases, which is 0.23.0 in this case.

BenTheElder commented 3 months ago

> OS (e.g. from /etc/os-release): windows 11

WSL2 + rootless podman is uncharted territory for us, but see these ~user contributed guides as well: https://kind.sigs.k8s.io/docs/user/using-wsl2/ https://kind.sigs.k8s.io/docs/user/rootless/

stmcginnis commented 3 months ago

I think cgroupv1 is going to be an issue, right?

cgroupVersion: v1
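
A quick way to confirm which cgroup version a Linux guest is using is to look at the filesystem type mounted at /sys/fs/cgroup. The sketch below assumes GNU `stat` is available inside the guest; the helper name `cgroup_version_from_fstype` is made up for illustration.

```shell
#!/bin/sh
# Map the filesystem type at /sys/fs/cgroup to a cgroup version.
# cgroup2fs means the unified (v2) hierarchy; tmpfs means v1 or hybrid.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Assumption: GNU stat; if the call fails, fall back to the word "missing".
fstype="$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo missing)"
echo "cgroup filesystem: ${fstype} ($(cgroup_version_from_fstype "$fstype"))"
```

Run it inside the guest that hosts the containers (for example after `podman machine ssh`), not in the Windows shell.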
tppalani commented 3 months ago

Hi @stmcginnis, yes. I have also tried the other image suggested by @aojea (image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e) and I still see the same issue:

Startup probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "9f88c516c2affe0d0a6255ed2b4a7ba3400a453753e21c4c8550227d1bbb4332": OCI runtime exec failed: exec failed: unable to start container process: error adding pid 7115 to cgroups: failed to write 7115: openat2 /sys/fs/cgroup/unified/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-besteffort.slice/kubelet-kubepods-besteffort-pod77edb64a_8046_4a7d_a69c_a57762b51e74.slice/cri-containerd-012ad1d06d571fd9c144addb4de8ab3c2eba9fa5490132fb0fa1e191bd11ab9f.scope/cgroup.procs: no such file or directory: unknown
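
For reference, pinning a node image in a kind config typically looks like the sketch below. It is not the reporter's actual config.yaml, which was never posted. The helper name `write_kind_config` and the single-node layout are assumptions; the image reference is the one quoted above.

```shell
#!/bin/sh
# Write a minimal kind config that pins the node image (hypothetical helper).
write_kind_config() {
  cat > "$1" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
EOF
}

config_file="$(mktemp)"
write_kind_config "$config_file"
cat "$config_file"

# Then, with kind installed and podman as the provider:
#   KIND_EXPERIMENTAL_PROVIDER=podman kind create cluster --config "$config_file"
```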
stmcginnis commented 3 months ago

OK, that does look like it may be part of the issue. I believe you will need to set up your WSL environment to be using cgroupv2.

tppalani commented 3 months ago

I'm really sorry, I'm using a Windows machine. How can I set up the WSL environment with cgroupv2?

$ wsl -l -v
  NAME                      STATE           VERSION
* podman-machine-default    Running         2
  podman-net-usermode       Running         2
tppalani commented 3 months ago

And here is the data from inside podman machine ssh:

# fstab intentionally empty for containers
/run/user/1000/podman/podman.sock /mnt/wsl/podman-sockets/podman-machine-default/podman-user.sock none noauto,user,bind,defaults 0 0
stmcginnis commented 3 months ago

Sorry, no idea as I haven't used WSL or Windows for a number of years now, but this looks like it may have some useful information: https://github.com/spurin/wsl-cgroupsv2
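
As far as I recall, the approach in that README boils down to a kernel command line setting in the Windows-side `.wslconfig` file, followed by a WSL restart. Treat the following as a sketch and check the README for the authoritative steps, including any extra changes it lists. The helper name `print_wslconfig_fragment` is made up for illustration.

```shell
#!/bin/sh
# Print a .wslconfig fragment that disables all cgroup v1 controllers,
# which leaves them available to the cgroup v2 hierarchy.
# Assumption: merge this into %UserProfile%\.wslconfig by hand; if the file
# already has a [wsl2] section, add only the kernelCommandLine line to it.
print_wslconfig_fragment() {
  cat <<'EOF'
[wsl2]
kernelCommandLine = cgroup_no_v1=all
EOF
}

print_wslconfig_fragment

# Afterwards, restart WSL from a Windows terminal so the setting takes effect:
#   wsl --shutdown
```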

BenTheElder commented 3 months ago

cgroup v1 can work but needs cgroupns support.

I would suggest using something like lima or docker desktop with docker instead, follow https://kind.sigs.k8s.io/docs/user/using-wsl2/

BenTheElder commented 3 months ago

The failure to write cgroups isn't in kind, that's coming from podman after we ask it to create the container.

BenTheElder commented 3 months ago

I would guess cgroupns issues on this linux guest environment
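
One rough check for cgroup namespace support in a guest kernel is whether `/proc/self/ns/cgroup` exists. The sketch below only tests kernel support. It does not tell you whether podman or docker actually starts containers with a private cgroup namespace. The helper name `kernel_has_cgroupns` is hypothetical.

```shell
#!/bin/sh
# Return success if the given namespace entry exists.
# Defaults to the current process's cgroup namespace link.
kernel_has_cgroupns() {
  ns_path="${1:-/proc/self/ns/cgroup}"
  [ -e "$ns_path" ] || [ -L "$ns_path" ]
}

if kernel_has_cgroupns; then
  echo "kernel exposes cgroup namespaces"
else
  echo "no cgroup namespace entry found"
fi
```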

tppalani commented 3 months ago

Hi @BenTheElder do I need to change any configuration from my side to make it work?

tppalani commented 3 months ago

Hi @BenTheElder @aojea

I don't think this is a cgroup issue, because when I use the image below all the pods are up and running without any error message. Do you have any idea why? I'm still using cgroup v1 only, according to the podman info output above.

kindest/node:v1.27.1@sha256:c44686bf1f422942a21434e5b4070fc47f3c190305be2974f91444cd34909f1b
rbngzlv commented 3 months ago

Seems that I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to use cgroup v2 as pointed out in the comments (although I'm using docker and not podman).

Thank y'all for the hints.

BenTheElder commented 3 months ago

> Hi @BenTheElder do I need to change any configuration from my side to make it work?

https://github.com/kubernetes-sigs/kind/issues/3685#issuecomment-2229228988

I recommend using a better supported platform than kind-on-podman-on-wsl2

Kubernetes uses docker on Linux primarily, some contributors use it on macOS.

podman on WSL2 with cgroup v1 is probably the worst supported combination of options in the ecosystem and I can't personally replicate this, I'm not a windows user, and no windows users have helped us figure out a workable CI approach (e.g. previously we tried actions but could not run docker or podman in that environment).

BenTheElder commented 3 months ago

> I don't think this is a cgroup issue, because when I use the image below all the pods are up and running without any error message. Do you have any idea why? I'm still using cgroup v1 only, according to the podman info output above.

This is difficult to debug over github when we receive partial information, for example you say you're using this image but not with what kind version / environment, and with only excerpts from the logs.
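
A small script can gather the version and environment details in one place before posting. This is only a sketch: the helper name `collect_debug_info` and the output file names are made up, and each tool is skipped if it is not installed.

```shell
#!/bin/sh
# Collect version and environment details into one directory (hypothetical helper).
# Missing tools are recorded instead of causing a failure.
collect_debug_info() {
  out_dir="$1"
  mkdir -p "$out_dir"
  for tool in kind podman docker; do
    if command -v "$tool" >/dev/null 2>&1; then
      case "$tool" in
        kind) kind version > "$out_dir/kind-version.txt" 2>&1 || true ;;
        *)    "$tool" info > "$out_dir/$tool-info.txt" 2>&1 || true ;;
      esac
    else
      echo "$tool: not installed" >> "$out_dir/missing-tools.txt"
    fi
  done
  uname -a > "$out_dir/uname.txt"
}

report_dir="$(mktemp -d)"
collect_debug_info "$report_dir"
echo "debug info written to $report_dir"

# With a cluster running, node and pod logs can be added with:
#   kind export logs "$report_dir/kind-logs"
```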

Have you looked at the suggestions above, including e.g. the complete guide for using WSL2? https://github.com/kubernetes-sigs/kind/issues/3685#issuecomment-2228848747

> Seems that I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to use cgroup v2 as pointed out in the comments (although I'm using docker and not podman).

Yes, I would highly recommend this. You can't create cgroup v2 clusters with Kubernetes < 1.19 but that's long out of support anyhow. Cgroup v2 is maturing and will be the focus for the ecosystem going forward, and in particular makes nested containers a lot more straightforward by typically having cgroupns enabled by default + the unified hierarchy.

tppalani commented 3 months ago

> Seems that I hit the same problem when trying to create a cluster on WSL 2 using the image kindest/node:v1.30, and fixed it by switching WSL to use cgroup v2 as pointed out in the comments (although I'm using docker and not podman).
>
> Thank y'all for the hints.

How did you fix it? Are you using a MacBook or a Windows system?

rbngzlv commented 3 months ago

> How did you fix it? Are you using a MacBook or a Windows system?

I was able to create a cluster by configuring WSL 2 to use cgroup v2 instead of cgroup v1, following the instructions in the readme of the repo shared in https://github.com/kubernetes-sigs/kind/issues/3685#issuecomment-2229225239.

tppalani commented 3 months ago

We are good; please close this ticket. The issue has been resolved.

$ podman run -it --rm spurin/wsl-cgroupsv2:latest
Success: cgroup type is cgroup2fs

AL44469@LDD4C6G3 MINGW64 ~/AWSCLI
$ podman info
host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
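
To check this field without reading the whole output, the `cgroupVersion` line can be pulled out of the `podman info` text. The sketch below parses the default YAML-style output shown above; the helper name `cgroup_version_from_podman_info` is made up, and the sample input is a trimmed copy of that output.

```shell
#!/bin/sh
# Read `podman info` text on stdin and print the value of cgroupVersion.
cgroup_version_from_podman_info() {
  awk '$1 == "cgroupVersion:" { print $2; exit }'
}

# Demo on a trimmed copy of the output above; in practice use:
#   podman info | cgroup_version_from_podman_info
printf 'host:\n  cgroupManager: cgroupfs\n  cgroupVersion: v2\n' \
  | cgroup_version_from_podman_info
```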
tppalani commented 3 months ago

Thanks for all the contributions and guidelines

stmcginnis commented 3 months ago

Glad you got it working!

/close

k8s-ci-robot commented 3 months ago

@stmcginnis: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kind/issues/3685#issuecomment-2237112871):

>Glad you got it working!
>
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

BenTheElder commented 3 months ago

Awesome!

We should probably add a note pointing to the WSL2 cgroupv2 guide in the WSL2 page?

BenTheElder commented 3 months ago

Thanks all

stmcginnis commented 3 months ago

> We should probably add a note pointing to the WSL2 cgroupv2 guide in the WSL2 page?

Good point, we really should capture that.

/reopen
/retitle WSL environment needs additional cgroupv2 configuration
/remove-kind bug
/kind documentation

k8s-ci-robot commented 3 months ago

@stmcginnis: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/kind/issues/3685#issuecomment-2237234032):

>> We should probably add a note pointing to the WSL2 cgroupv2 guide in the WSL2 page?
>
>Good point, we really should capture that.
>
>/reopen
>/retitle WSL environment needs additional cgroupv2 configuration
>/remove-kind bug
>/kind documentation

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.