kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0

Error 'cpu.weight: no such file or directory: unknown' when starting control plane in WSL2, kernel 5.15 #3179

Closed zapho closed 1 year ago

zapho commented 1 year ago

What happened:

Starting kind no longer works in my WSL2 environment. Running kind create cluster --loglevel=debug --retain leads to the following output:

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.25.3) 🖼
 ✓ Preparing nodes 📦 📦 📦 📦
 ✓ Writing configuration 📜
 ✗ Starting control-plane 🕹️
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

It looks similar to https://github.com/kubernetes-sigs/kind/issues/2731, but I'm not using a 5.17+ kernel.

Environment: WSL2, kind 0.18

uname -a
#Linux DUC-Lqp0bH3hPUL 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ kind version
#kind v0.18.0 go1.20.2 linux/amd64

3029766531.tar.gz

What you expected to happen: kind starting a cluster without error.

How to reproduce it (as minimally and precisely as possible): Get WSL2 (see below)

Use a Fedora 36 distribution (see below)

Start kind: kind create cluster

Anything else we need to know?: This was working flawlessly with the same Fedora version two weeks ago. I suspect that WSL2 has received a kernel update since then that might have triggered this issue.

Environment:

wsl --version
WSL version: 1.2.0.0
Kernel version: 5.15.90.1
WSLg version: 1.0.51
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.2728
cat /etc/*release*
Generic release 36 (Generic)
NAME="Fedora Remix for WSL"
VERSION="36"
ID=fedoraremixforwsl
ID_LIKE=fedora
VERSION_ID=36
PLATFORM_ID="platform:f36"
PRETTY_NAME="Fedora Remix for WSL"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:36"
HOME_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL"
SUPPORT_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL"
BUG_REPORT_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL/issues"
PRIVACY_POLICY_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL/blob/master/PRIVACY.md"
FEDORA_REMIX_VERSION=36.0.5
Generic release 36 (Generic)
Generic release 36 (Generic)
cpe:/o:generic:generic:36
docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
  compose: Docker Compose (Docker Inc., v2.10.2)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 55
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.90.1-microsoft-standard-WSL2
 Operating System: Fedora Remix for WSL
 OSType: linux
 Architecture: x86_64
 CPUs: 5
 Total Memory: 23.48GiB
 Name: DUC-Lqp0bH3hPUL
 ID: MN7P:YJ4X:3IH6:RR75:R2NO:Y6DS:JYE6:F6GX:H46N:NKMN:CMJ4:AIZA
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 24
  Goroutines: 34
  System Time: 2023-04-19T10:05:19.002180713+02:00
  EventsListeners: 0
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://registry-nexus.orbis.dedalus.com/
 Live Restore Enabled: false
> kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:57:26Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
BenTheElder commented 1 year ago

✓ Ensuring node image (kindest/node:v1.25.3) 🖼

kind create cluster

Environment: WSL2, kind 0.18

Hmmm, these things don't match? KIND v0.18 defaults to a 1.26 image

BenTheElder commented 1 year ago

Re: https://github.com/kubernetes-sigs/kind/issues/2731, we intermittently get issues from users on odd kernels that are missing cgroups functionality or have broken cgroups. I wouldn't expect this on Fedora, though last I checked, WSL typically used a Windows-specific init instead of systemd (which would otherwise be setting up the cgroups).
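
For anyone wanting to check the cgroups theory on their own machine, here is a minimal sketch in Python. It assumes a cgroup v2 host with the unified hierarchy mounted at /sys/fs/cgroup, and reads the standard cgroup.controllers file to see whether the cpu controller (which provides cpu.weight) is available. The REQUIRED list and the function names are illustrative assumptions, not something kind itself defines, and a passing check does not rule out the WSL2-specific problem discussed in this thread.

```python
from pathlib import Path

# Controllers assumed (for this sketch only) to matter for a kind node.
# The thread itself only shows that cpu.weight, a "cpu" controller file, was missing.
REQUIRED = ("cpu", "memory", "pids")


def missing_controllers(controllers_text, required=REQUIRED):
    """Return the required controllers absent from a cgroup.controllers listing."""
    available = set(controllers_text.split())
    return [name for name in required if name not in available]


def check_host(root="/sys/fs/cgroup"):
    """Inspect the cgroup v2 root; return None if no cgroup.controllers file is found."""
    listing = Path(root) / "cgroup.controllers"
    if not listing.is_file():
        return None
    return missing_controllers(listing.read_text())


if __name__ == "__main__":
    result = check_host()
    if result is None:
        print("no cgroup v2 hierarchy found at /sys/fs/cgroup")
    elif result:
        print("missing controllers:", ", ".join(result))
    else:
        print("all assumed-required controllers are available")
```

Running it inside the WSL2 distribution (not on the Windows side) gives a quick first signal before digging into the retained node logs.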

BenTheElder commented 1 year ago

I think this is https://github.com/kubernetes-sigs/kind/issues/3165#issuecomment-1513665114

BenTheElder commented 1 year ago

Check https://github.com/kubernetes-sigs/kind/issues/3180, we have an issue with WSL2 1.2.0.0 that is fixed in a later WSL2 version.

zapho commented 1 year ago

Fixed after updating WSL2 to version 1.2.5.0. The kernel version is unchanged after the update.

wsl --version
WSL version: 1.2.5.0
Kernel version: 5.15.90.1
WSLg version: 1.0.51
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.2728
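
Since the kernel version is identical before and after, the WSL version on the first line is the value worth checking. A rough Python sketch of that check follows. It assumes the WSL version is the first dotted four-part number on the first line of the `wsl --version` output, as in the output above. The 1.2.5.0 threshold is only the version confirmed working here; the thread does not say which release first contained the fix. The function names are hypothetical.

```python
import re

# Version confirmed working in this thread; the first fixed release is not
# stated, so this threshold is an assumption.
KNOWN_GOOD = (1, 2, 5, 0)


def parse_wsl_version(output):
    """Extract the WSL version tuple from the first line of `wsl --version` output."""
    lines = output.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"\d+(?:\.\d+){3}", first_line)
    if match is None:
        raise ValueError("no dotted version found on the first line")
    return tuple(int(part) for part in match.group().split("."))


def is_known_good(output, minimum=KNOWN_GOOD):
    """True if the reported WSL version is at least the known-good one."""
    return parse_wsl_version(output) >= minimum
```

Because only the number is matched, the localized label in front of it does not matter. Output captured directly from wsl.exe may need decoding (it is often UTF-16) before being passed in.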