docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

/sys/fs/cgroup/cpuset/docker/cpuset.cpus: no such file or directory due to noprefix mount option of CGROUP #689

Open sophy228 opened 5 years ago

sophy228 commented 5 years ago

Sorry, I have to re-report this issue, which is the same as https://github.com/moby/moby/issues/33594, since I have not found an official solution for it.

On OSes (Android, Chrome OS) that mount the cpuset cgroup with the noprefix option, we only have cpus and mems rather than cpuset.cpus and cpuset.mems.
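
A quick way to confirm which naming a given system uses is to look at the files in the cpuset hierarchy root. The sketch below assumes the cgroup v1 cpuset controller is mounted at /sys/fs/cgroup/cpuset (the path from the error message); `cpuset_naming` is a hypothetical helper name, not part of Docker or runc.

```shell
#!/bin/sh
# Hypothetical helper: report how cpuset control files are named in a directory.
# Prints "prefixed" if cpuset.cpus exists, "noprefix" if only cpus exists,
# and "unknown" otherwise.
cpuset_naming() {
    dir="$1"
    if [ -e "$dir/cpuset.cpus" ]; then
        echo "prefixed"
    elif [ -e "$dir/cpus" ]; then
        echo "noprefix"
    else
        echo "unknown"
    fi
}

# Example (the path is an assumption; adjust it to where cpuset is mounted):
# cpuset_naming /sys/fs/cgroup/cpuset
```

On a noprefix system this should report `noprefix`, which matches the "no such file or directory" error for cpuset.cpus.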

I don't want to change the OS behavior; I want to fix this in Docker instead. However, I have run into some problems:

  1. the error log is

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:275: applying cgroup configuration for process caused \"open /sys/fs/cgroup/cpuset/docker/cpuset.cpus: no such file or directory\"": unknown.

container_linux.go:345: I can find container_linux.go in the docker-ce project, but it has no line 345. process_linux.go:275: I cannot find such a file at all!

So what does this error log mean? How can I find where the error originates?

  2. I tried modifying components/engine/pkg/sysinfo/sysinfo_linux.go and components/engine/vendor/github.com/containerd/cgroups/cpuset.go.

I changed every "cpuset.xxx" to "xxx", but it still failed. I also added logging in /vendor/github.com/containerd/cgroups/cpuset.go, and it seems the daemon never calls the functions in that file before the failure.

So how can I find where the error occurs, and what should I modify?

Thanks !
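
The `file.go:NNN` fragments in the message are source locations inside the component that raised the error. Since the message starts with "OCI runtime create failed", that is most likely the OCI runtime (runc) rather than the Docker engine tree, and the line numbers only match the exact runtime version that is installed. A minimal sketch for pulling those fragments and the missing path out of such a message, using only grep and sed; `parse_oci_error` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: list the Go source locations and the missing path
# mentioned in an "OCI runtime create failed" message read from stdin.
parse_oci_error() {
    msg=$(cat)
    # Source locations look like "container_linux.go:345".
    printf '%s\n' "$msg" | grep -oE '[A-Za-z0-9_]+\.go:[0-9]+'
    # The failing path follows the word "open" and ends before ": no such file".
    printf '%s\n' "$msg" | sed -nE 's/.*open ([^ :]+): no such file.*/path: \1/p'
}

# Example: docker run hello-world 2>&1 | parse_oci_error
```

The printed file names can then be searched for in a checkout of the runtime at the commit that `docker info` reports.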

stealthybox commented 5 years ago

I'm working on fixing this.

sophy228 commented 5 years ago

@stealthybox any progress now?

sophy228 commented 5 years ago

I have submitted a fix for runc, and it now works in my case.

However, I found similar code in Docker Engine and containerd.

I could fix those as well, but I don't know how to trigger the issue there or how to test it. In my case everything now works with only the runc fix.

I have no idea when and how Docker Engine and containerd access the cpuset files.

stealthybox commented 5 years ago

@sophy228 hi :wave: -- yes, I have a set of working patches for all of runc, dockerd, containerd, and cadvisor (a kubelet dependency).

I don't like how they're all variations of the same logic. Some of it is a bit scary. You can definitely tell that some code was copied between these projects.

Nice patch! Looks like @cyphar welcomes a fix. In order to trigger the other higher level bugs on android, you need to use containerd+dockerd with your patched runc. Building for arm64 / armv7 can be a little tricky. I got some help on the docker community #mobyproject slack channel.

I did plan to polish off all of these patches and submit them -- let me know if you want to collab :+1:

madrisan commented 4 years ago

Hello. Same error after upgrading Fedora to version 31. Docker package: docker-ce-19.03.4-3.fc31.x86_64

AkihiroSuda commented 4 years ago

@madrisan

  sudo dnf install -y grubby && \
  sudo grubby \
  --update-kernel=ALL \
  --args="systemd.unified_cgroup_hierarchy=0"

Background: https://medium.com/nttlabs/cgroup-v2-596d035be4d7
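
Before applying the workaround, it can help to confirm that the host really booted with the unified (v2) hierarchy. A common check is the filesystem type of /sys/fs/cgroup: `cgroup2fs` indicates v2, while `tmpfs` indicates the v1 or hybrid layout. The sketch below wraps that check; the function name `cgroup_layout` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a cgroup layout.
cgroup_layout() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 (legacy or hybrid)" ;;
        *)         echo "unknown ($1)" ;;
    esac
}

# stat -fc %T prints the filesystem type name (GNU coreutils).
# Example: cgroup_layout "$(stat -fc %T /sys/fs/cgroup)"
```

If this reports v2 on Fedora 31 with Docker 19.03, the kernel argument above is the relevant workaround.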

Snawoot commented 4 years ago

@AkihiroSuda Thank you!

timakamystery commented 4 years ago

  sudo dnf install -y grubby && \
  sudo grubby \
  --update-kernel=ALL \
  --args="systemd.unified_cgroup_hierarchy=0"

+1. I had to restart the OS, and then it worked!

chainpioneer commented 4 years ago

None of the other approaches worked for me except this one:

  1. Open /etc/default/grub as admin.
  2. Append systemd.unified_cgroup_hierarchy=0 to the value of GRUB_CMDLINE_LINUX.
  3. sudo -i
  4. sudo grub2-mkconfig > /boot/efi/EFI/fedora/grub.cfg (UEFI systems) or sudo grub2-mkconfig > /boot/grub2/grub.cfg (BIOS systems)
  5. sudo reboot now
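
Step 2 can also be scripted. The sketch below is an assumption-laden example, not a tested installer: `append_grub_arg` is a hypothetical helper, it assumes GRUB_CMDLINE_LINUX is a double-quoted value on a single line, and it relies on GNU sed for `-i`. Try it on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel argument to GRUB_CMDLINE_LINUX in a
# grub defaults file, unless it is already present. Assumes the value is
# double-quoted on a single line and that GNU sed is available (for -i).
append_grub_arg() {
    file="$1"
    arg="$2"
    if grep -q "^GRUB_CMDLINE_LINUX=.*${arg}" "$file"; then
        return 0   # already present, nothing to do
    fi
    # Insert " <arg>" just before the closing quote of the value.
    sed -i -E "s|^(GRUB_CMDLINE_LINUX=\"[^\"]*)\"|\1 ${arg}\"|" "$file"
}

# Example on a copy:
# cp /etc/default/grub /tmp/grub.test
# append_grub_arg /tmp/grub.test systemd.unified_cgroup_hierarchy=0
```

The configuration still has to be regenerated with grub2-mkconfig and the machine rebooted, as in steps 4 and 5.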

macieg-b commented 4 years ago

All the ways did not work for me except this one:

1. open /etc/default/grub as admin

2. Append value of **GRUB_CMDLINE_LINUX** with `systemd.unified_cgroup_hierarchy=0`

3. `sudo -i`

4. `sudo grub2-mkconfig > /boot/efi/EFI/fedora/grub.cfg` or
   `sudo grub2-mkconfig > /boot/grub2/grub.cfg`

5. `sudo reboot now`

Works!

xilent commented 4 years ago

Same problem on docker run hello-world.

OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"open /sys/fs/cgroup/docker/cpuset.cpus.effective: no such file or directory\"": unknown.

Client:
 Debug Mode: false

Server:
 Containers: 14
  Running: 0
  Paused: 0
  Stopped: 14
 Images: 3
 Server Version: 19.03.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.3.16-300.fc31.x86_64
 Operating System: Fedora 31 (Thirty One)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 14.62GiB
 Name: jupiter
 ID: C6CE:TU7L:AEMW:QF3D:2AUT:DS5I:YKZQ:WGYV:ZQXB:7GNB:QNBN:MTOU
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 24
  Goroutines: 37
  System Time: 2019-12-19T20:02:16.672999703+01:00
  EventsListeners: 0
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
WARNING: No cpu shares support

Trixanna commented 4 years ago

All the ways did not work for me except this one:

1. open /etc/default/grub as admin

2. Append value of **GRUB_CMDLINE_LINUX** with `systemd.unified_cgroup_hierarchy=0`

3. `sudo -i`

4. `sudo grub2-mkconfig > /boot/efi/EFI/fedora/grub.cfg` or
   `sudo grub2-mkconfig > /boot/grub2/grub.cfg`

5. `sudo reboot now`

Omg ty! This one was hard to track down but this fixed mine too! :)

120dev commented 4 years ago

This saved my day :) https://medium.com/@drpdishant/installing-docker-on-fedora-31-a073db823bb8

saif-ellafi commented 4 years ago

This saved my day :) https://medium.com/@drpdishant/installing-docker-on-fedora-31-a073db823bb8

For me too! thanks.

Any side effects of this grub option?

saif-ellafi commented 4 years ago

systemd.unified_cgroup_hierarchy

    When specified without an argument or with a true argument, enables the usage of unified cgroup hierarchy (a.k.a. cgroups-v2). When specified with a false argument, fall back to hybrid or full legacy cgroup hierarchy.

    If this option is not specified, the default behaviour is determined during compilation (the -Ddefault-hierarchy= meson option). If the kernel does not support unified cgroup hierarchy, the legacy hierarchy will be used even if this option is specified.

mvhirsch commented 4 years ago

Also see https://github.com/docker/cli/issues/297

jifalops commented 4 years ago

I'm trying to fix this on ChromeOS under a chroot (crouton). It doesn't use GRUB, so the unified cgroup hierarchy kernel parameter isn't really an option. I've tried symlinking cpus and mems to cpuset.xxx, but the mount's options prevent this, and I haven't been able to remount the cgroups successfully. Are there any known workarounds? It seems like I'm so close.
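
To see how the cpuset hierarchy is mounted, the options column in /proc/mounts can be inspected. The sketch below prints the mount point of any cgroup v1 mount whose options include both `cpuset` and `noprefix`. The function name `noprefix_cpuset_mounts` is hypothetical, and it takes the mounts file as an argument so it can be tried on a saved copy.

```shell
#!/bin/sh
# Hypothetical helper: list mount points of cgroup v1 mounts that carry both
# the "cpuset" and "noprefix" options. Fields in /proc/mounts are:
# device, mount point, fs type, options, dump, pass.
noprefix_cpuset_mounts() {
    awk '$3 == "cgroup" {
        n = split($4, opts, ",")
        has_cpuset = 0; has_noprefix = 0
        for (i = 1; i <= n; i++) {
            if (opts[i] == "cpuset")   has_cpuset = 1
            if (opts[i] == "noprefix") has_noprefix = 1
        }
        if (has_cpuset && has_noprefix) print $2
    }' "$1"
}

# Example: noprefix_cpuset_mounts /proc/mounts
```

Any path it prints is a hierarchy where the control files are named cpus and mems instead of cpuset.cpus and cpuset.mems.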

nolange commented 4 years ago

systemd.unified_cgroup_hierarchy

    When specified without an argument or with a true argument, enables the usage of unified cgroup hierarchy (a.k.a. cgroups-v2). When specified with a false argument, fall back to hybrid or full legacy cgroup hierarchy.

    If this option is not specified, the default behaviour is determined during compilation (the -Ddefault-hierarchy= meson option). If the kernel does not support unified cgroup hierarchy, the legacy hierarchy will be used even if this option is specified.

Are you planning to support the unified_hierarchy some day?

AkihiroSuda commented 4 years ago

Already supported on master

stealthybox commented 4 years ago

@jifalops I was working on this from an android perspective. My understanding is the ChromeOS issue is similar.

I had to patch runc, containerd, dockerd, and the kubelet's cadvisor in order to support noprefix cgroup mounts.

I got everything working but haven't had the time to publish and push through the patches yet. You may have luck patching your own builds.

celesteking commented 4 years ago

Hooray for systemd breaking things without providing adequate fallback/legacy API.

madrisan commented 4 years ago

systemd supports both versions of cgroups, so nothing is broken at all. It's up to the distribution maintainers to decide which version to use by default. And as pointed out in this same thread, switching from one version to the other is just a matter of adding a kernel parameter at boot.
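
Whether that parameter was actually passed at boot can be checked from the kernel command line in /proc/cmdline. A rough sketch: the helper name `unified_hierarchy_arg` is hypothetical, it only recognizes the explicit `=value` form, and it reports the first occurrence it finds.

```shell
#!/bin/sh
# Hypothetical helper: print the value of systemd.unified_cgroup_hierarchy
# found in a kernel command line string, or "unset" if it is absent.
unified_hierarchy_arg() {
    # $1 is deliberately unquoted so the command line splits into words.
    for word in $1; do
        case "$word" in
            systemd.unified_cgroup_hierarchy=*)
                echo "${word#*=}"
                return 0 ;;
        esac
    done
    echo "unset"
}

# Example: unified_hierarchy_arg "$(cat /proc/cmdline)"
```

A result of `0` means the legacy or hybrid hierarchy was requested; `unset` means the distribution default applies.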

celesteking commented 4 years ago

Yeah, it's like swapping a nuclear reactor engine. You have to shut it down first. Easy as it gets. That's no compatibility, that's called an offline swap.

madrisan commented 4 years ago

Compatibility with (or a switch to) the unified cgroup hierarchy has to be implemented in Docker, and that work is in progress. So let's give them the time they need, or just switch back to the old implementation at boot time if you use the only (afaik) Linux distribution that has chosen to break Docker in order to push its alternative container engine.

nolange commented 4 years ago

systemd uses cgroups. The newer cgroups v2 is the one that is not backwards compatible or fully functional alongside v1, and it has been around since 2016. If you configure systemd to set up v2, then you have to live with the consequences (apps incompatible with v2); if not, you can pick v1 with a boot parameter.

So Linux has multiple options, systemd supports all of them, Docker doesn't (yet), and you are overwhelmed by that. Blame systemd all you want, but maybe do that in the right cult.

celesteking commented 4 years ago

Here we go with SJW (systemd justice warriors) defending dubious systemd choices. Color me surprised.

ghost commented 4 years ago

try this:-

cset shield --reset

lib314a commented 4 years ago

There is now an official Fedora Magazine article, "Docker and Fedora 32", that explains how to resolve the error.

Richard87 commented 3 years ago

Hi guys! I just installed Docker 20.10 beta1. It had some issues configuring the bridge, but after adding it manually (brctl addbr docker1) and setting it in daemon.json ({"bridge": "docker1"}), Docker works with cgroups v2, at least with the hello-world image :D
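
The bridge setting described above can be sketched as follows. `bridge_daemon_json` is a hypothetical helper that only prints the JSON fragment; the target path /etc/docker/daemon.json is an assumption (the usual default location), and writing it this way would overwrite any existing settings, so merge by hand if the file already exists.

```shell
#!/bin/sh
# Hypothetical helper: print a daemon.json body that points dockerd at a
# manually created bridge.
bridge_daemon_json() {
    printf '{\n  "bridge": "%s"\n}\n' "$1"
}

# Example (requires root; the daemon.json path is an assumption):
# brctl addbr docker1
# bridge_daemon_json docker1 > /etc/docker/daemon.json
# systemctl restart docker
```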

satmandu commented 1 year ago

@jifalops I was working on this from an android perspective. My understanding is the ChromeOS issue is similar.

I had to patch runc, containerd, dockerd, and the kubelet's cadvisor in order to support noprefix cgroup mounts.

I got everything working but haven't had the time to publish and push through the patches yet. You may have luck patching your own builds.

Is there any info on how to fix this? I'm trying to get this working for ChromeOS & Chromebrew: https://github.com/chromebrew/chromebrew/pull/7828

satmandu commented 1 year ago

@sophy228 Was https://github.com/opencontainers/runc/pull/2090 sufficient for runc to work? @stealthybox Did you have any working patches you were willing to share?