mac-jan:my-question-generator jan$ colima ssh ls /sys/fs/group
ls: /sys/fs/group: No such file or directory
@janvda The directory is called cgroup, not group.
Sorry - my fault.
I have re-executed the command using the correct directory:
mac-jan:my-question-generator jan$ colima ssh
colima:/Users/jan/Documents/15_iot/nuc/my-question-generator$ ls /sys/fs/cgroup
acpid cpuacct docker lima-guestagent net_prio perf_event sshd
blkio cpuset freezer memory networking pids udev-postmount
cpu devices hugetlb net_cls openrc qemu-binfmt unified
colima:/Users/jan/Documents/15_iot/nuc/my-question-generator$
That is still the "hybrid" layout. Not sure what colima is doing that breaks this. Is there a /etc/conf.d/cgroups file, and if yes, what is the content?
No, there is no such file.
colima:/etc/conf.d$ ls
bootmisc devfs fsck killprocs logrotate net-online rdate swap udev-settle
consolefont dmesg hwclock klogd modloop netmount seedrng swclock udev-trigger
containerd docker ip6tables loadkmap modules ntpd sshd syslog watchdog
crond ebtables iptables localmount mtab qemu-binfmt staticroute udev
colima:/etc/conf.d$
I discovered that it is possible to use Ubuntu for colima (see the colima FAQ) with the following command: colima start --layer=true.
The command colima ssh more /etc/os-release shows that it is indeed running Ubuntu.
Here is the output of that command:
PRETTY_NAME="Ubuntu 23.04"
NAME="Ubuntu"
VERSION_ID="23.04"
VERSION="23.04 (Lunar Lobster)"
VERSION_CODENAME=lunar
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=lunar
LOGO=ubuntu-logo
But this didn't fix the problem. I am still getting the same error when building:
=> ERROR [2/3] RUN apt-get update -y && apt-get install -y git nano wget && pip install --upgrade pip 0.2s
------
> [2/3] RUN apt-get update -y && apt-get install -y git nano wget && pip install --upgrade pip:
#0 0.204 runc run failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/6), flags: 0xf, data: openrc: invalid argument
I am actually wondering if we are looking at the right location.
The build is happening in the buildkit container (image moby/buildkit:buildx-stable-1), so I would think that the problem is in that container.
I also think that the problem started when it pulled a new version of this image from Docker Hub.
Maybe this container image is broken/incompatible?
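If a freshly pulled builder image is the suspect, one way to test that theory is to create a buildx builder pinned to a fixed BuildKit image instead of the moving buildx-stable-1 tag. A sketch (the tag below is just an example of an older release, not a known-good version):
# Create a docker-container builder pinned to an explicit BuildKit tag and make it the default.
docker buildx create --name pinned-buildkit \
  --driver docker-container \
  --driver-opt image=moby/buildkit:v0.11.6 \
  --use
# Re-run the failing build against the pinned builder.
docker buildx build .
If the build succeeds with the pinned image but fails with buildx-stable-1, that would support the broken/incompatible image theory.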
That's a userspace image on top of the VM, not the VM OS. You can see from your error message that the underlying cgroups are still managed by OpenRC.
You can start an ubuntu VM with https://github.com/lima-vm/lima instead (which colima is built on), please see previous comments https://github.com/kubernetes-sigs/kind/issues/3277#issuecomment-1692178393.
Thanks, switching to limactl start template://docker fixed my issue. I am now again able to build docker images without errors.
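For reference, the lima route looks roughly like this (a sketch; the socket path comes from the docker template's own post-start instructions, so double-check what limactl prints):
limactl start template://docker
# Point the docker CLI at the dockerd socket inside the VM
# (path as printed by limactl after the VM starts).
docker context create lima-docker \
  --docker "host=unix://$HOME/.lima/docker/sock/docker.sock"
docker context use lima-docker
kind create cluster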
I do not want to duplicate issues. Running on macOS Ventura 13.5.1.
Kind version
$ kind --version
> kind version 0.20.0
$ kind create cluster --config=config/kind/main.yaml
> Creating cluster "kind-local" ...
✓ Ensuring node image (kindest/node:v1.27.3) 🖼
✗ Preparing nodes 📦 📦
Deleted nodes: ["kind-local-control-plane" "kind-local-worker"]
ERROR: failed to create cluster: command "docker run --name kind-local-control-plane --hostname kind-local-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=kind-local --net kind --restart=on-failure:1 --init=false --cgroupns=private --publish=0.0.0.0:30070:30080/TCP --publish=127.0.0.1:62681:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72" failed with error: exit status 125
Command Output: a7174e21d76791171c521a8b7fd09e4fd2122f8f602d0735204f58073478078f
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/openrc (via /proc/self/fd/7), flags: 0xe, data: openrc: invalid argument: unknown.
Docker info
$ docker info
Client:
Version: 24.0.2-rd
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.0
Path: /Users/ik/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.19.0
Path: /Users/ik/.docker/cli-plugins/docker-compose
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 22
Server Version: 23.0.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 1fbd70374134b891f97ce19c70b6e50c7b9f4e0d
runc version: 860f061b76bb4fc671f0f9e900f7d80ff93d4eb7
init version:
Security Options:
seccomp
Profile: builtin
Kernel Version: 6.1.32-0-virt
Operating System: Alpine Linux v3.18
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 5.798GiB
Name: lima-rancher-desktop
ID: JL2Y:IUE7:SXIV:CD7T:LS7D:PUWN:PAUE:TB6O:ELJP:7JVT:K67A:OSBM
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Rollback to 0.19
$ go install sigs.k8s.io/kind@v0.19.0
$ kind --version
> kind version 0.19.0
$ kind create cluster --config=config/kind/main.yaml
> Creating cluster "kind-local" ...
✓ Ensuring node image (kindest/node:v1.27.1) 🖼
✓ Preparing nodes 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-kind-local"
You can now use your cluster with:
kubectl cluster-info --context kind-kind-local
Colima v0.6.0 supports kind https://github.com/abiosoft/colima/releases/tag/v0.6.0
Thanks @abiosoft!
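For existing colima users, a rough upgrade path (a sketch assuming a Homebrew install; recreating the VM so it picks up the new base is an assumption on my part, and note that deleting it discards its images and containers):
brew upgrade colima
colima delete   # discards the old VM, including its images/containers
colima start
kind create cluster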
@abiosoft does this mean it now also works with latest Rancher Desktop?
@marcofranssen No, it does not. colima switched from Alpine to Ubuntu to avoid the issue, but Rancher Desktop still uses Alpine.
The best you can do on Rancher Desktop right now is to use k3d instead of kind. It should provide very similar functionality, but uses k3s instead of kubeadm internally.
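A minimal k3d equivalent of a small kind cluster, for anyone trying this on Rancher Desktop (cluster name and node counts are arbitrary):
# One server (control-plane) plus two agents (workers).
k3d cluster create demo --servers 1 --agents 2
kubectl cluster-info --context k3d-demo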
Off-topic question, but why not use Rancher Desktop's Kubernetes? What is missing in Rancher Desktop's Kubernetes? (Setting custom feature gates, etc.?)
For me the only reason to use k3d is when I want to have a multi-node cluster to play around with pod placement strategies like taints and affinity, to make sure the manifests work as expected.
Eventually there should be a config setting in Rancher Desktop to allow multiple nodes. Personally I've also wanted a mixed-architecture cluster with both amd64 and arm64 nodes, but that is more for fun than actual need...
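For comparison, the kind side of that multi-node use case is just a config file with extra worker entries; a minimal sketch (file name hypothetical):
# Hypothetical multi-node kind config: one control-plane plus two workers.
cat > kind-multi-node.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
kind create cluster --config kind-multi-node.yaml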
Multi-node is one of the common reasons I see versus the bundled k8s in containers-in-a-VM solutions; the other is more control over the k8s version used.
You can pick any k8s (k3s) version you want in Rancher Desktop, and you can also upgrade to any new version and see how it affects your deployed workloads.
I'm not actually sure if versions prior to 1.19 still work properly, but all the more recent releases should be fully functional.
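Rancher Desktop also ships the rdctl CLI, so switching versions can be scripted; a sketch (flag names may vary between Rancher Desktop releases, and the version string is only an example):
# Enable Kubernetes and pin a specific k3s version via rdctl.
rdctl set --kubernetes-enabled=true --kubernetes-version=1.27.3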
To add one more data point to the issues with Alpine (under Rancher Desktop), this is the output that I get from kind after it fails to work...
INFO: ensuring we can execute mount/umount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: detected cgroup v1
INFO: detected cgroupns
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: faking /sys/class/dmi/id/product_name to be "kind"
INFO: faking /sys/class/dmi/id/product_uuid to be random
INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
INFO: setting iptables to detected mode: legacy
INFO: detected IPv4 address: 172.18.0.2
INFO: detected IPv6 address: fc00:f853:ccd:e793::2
INFO: starting init
Inserted module 'autofs4'
Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
Right, there's discussion of this above. We should have permission to mount /sys/fs/cgroup here in this privileged container, so ... something is odd/broken in that environment.
I can't run Rancher Desktop at work (VM policy), so I'd appreciate others that use Rancher Desktop debugging this issue.
Er and to clarify we have code specifically to ensure things run smoothly on non-systemd hosts:
However, on these particular Alpine-based hosts we seem to be unable to make mounts, which doesn't make sense. With cgroupns enabled we're getting our own view of cgroups, and with privileged we should have permission to make mounts (see e.g. the remount of /sys read-only earlier in the logs). It's possible we can't make this mount in any environment and it only works on other hosts as a side effect of systemd being on the host; this requires more root-cause debugging.
I still haven't had time to dig into this myself, currently focused on some follow-ups around https://kubernetes.io/blog/2023/08/31/legacy-package-repository-deprecation/, and this is somewhat outside of @aojea's usual wheelhouse.
In the meantime I recommend lima w/ ubuntu docker profile or colima as free alternatives to docker desktop that work with kind.
I would appreciate help in investigating this bug.
cgroupns will be the default on cgroups v2 hosts under all major container runtimes and is enabled for good reasons, so just reverting the cgroupns enablement in an attempt to unbreak Alpine isn't a very good option (note: Rancher Desktop is on v2 with cgroupns enabled by default now anyhow), but I'd love to see other suggested fixes or debugging work from anyone else invested in this support.
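For anyone with a Rancher Desktop VM handy, a minimal repro of the failing mount outside of kind might look like the sketch below (node image tag and mount options taken from the logs above; treat it as a starting point, not a verified recipe):
# Attempt the same name=systemd cgroup mount that fails inside the node,
# in a privileged container with a private cgroup namespace.
docker run --rm --privileged --cgroupns=private --entrypoint /bin/sh \
  kindest/node:v1.27.3 -c '
    mkdir -p /sys/fs/cgroup/systemd &&
    mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd &&
    echo "mount succeeded"'
If that fails with "Operation not permitted" on the Alpine-based VM but succeeds elsewhere, it would narrow the problem down to the host environment rather than kind's entrypoint.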
Just wanted to give a quick heads-up that the issue seems to be fixed by Alpine 3.19 (most likely due to the update to OpenRC 0.51+, which has fixed the "unified" cgroups layout):
$ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.27.3) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Have a nice day! 👋
$ k get no
NAME STATUS ROLES AGE VERSION
kind-control-plane NotReady control-plane 11s v1.27.3
So this issue can probably be closed, unless you want to wait until a version of Rancher Desktop with Alpine 3.19 is out for verification. That is probably not going to happen until early March though.
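If someone wants to verify a newer Alpine-based VM before closing, a quick sketch of what to check from inside the VM (e.g. via rdctl shell on Rancher Desktop):
cat /etc/alpine-release                  # expect 3.19 or newer
grep rc_cgroup_mode /etc/conf.d/cgroups  # OpenRC cgroup mode, if the file exists
mount | grep ' /sys/fs/cgroup '          # a single cgroup2 mount indicates the unified layout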
/close
let's close it here, there is nothing else we can do and you provided a solution
@aojea: Closing this issue.
This issue is closed, but there is still an open issue in rancher desktop - it's hidden in the collapsed comments, so linking it here again https://github.com/rancher-sandbox/rancher-desktop/issues/5092
Circling back, we have reports of rancher desktop + kind v0.23 working in https://kubernetes.slack.com/archives/CEKK1KTN2/p1723583621985329?thread_ts=1723579586.749849&cid=CEKK1KTN2
FYI @jandubois
NOTE: you may still run into issues from https://kind.sigs.k8s.io/docs/user/known-issues/, in this case with many clusters, tuning inotify limits was required https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files
(it might? be reasonable to bump the defaults in Rancher Desktop)
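For reference, the fix from that known-issues page boils down to raising the inotify limits inside the VM (values are the ones suggested on that page):
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512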
What happened:
After updating to v0.20.0 I cannot create a cluster anymore.
I'm using Mac with colima
What you expected to happen:
No error and cluster creates successfully
How to reproduce it (as minimally and precisely as possible):
Environment:
- kind version: (use kind version): v0.20.0
- Runtime info: (use docker info or podman info):
- OS (e.g. from /etc/os-release): macOS with colima VM
- /etc/os-release from within the VM that hosts the docker daemon: