microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

Call sched_setscheduler function in docker container, get EPERM error #10167

Open SchwarzeMagie opened 1 year ago

SchwarzeMagie commented 1 year ago

Windows Version

Microsoft Windows [Version 10.0.22621.1778]

WSL Version

1.2.5.0

Are you using WSL 1 or WSL 2?

WSL 2

Kernel Version

5.15.90.1

Distro Version

Ubuntu 20.04.6 LTS

Other Software

Docker version 24.0.2, build cb74dfc

Repro Steps

My Docker image is based on ubuntu:bionic-20210723; this is my development container.

Docker run command:

docker run --name $NAME \
           --security-opt seccomp=unconfined \
           --cap-add sys_nice \
           --privileged

My test code:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sched.h>

int main()
{
    struct sched_param sched;
    sched.sched_priority = 1;

    if (sched_setscheduler(0, SCHED_RR, &sched) < 0) {
        fprintf(stderr, "sched_setscheduler failed: err(%d) = %s\n", errno, strerror(errno));
    } else {
        printf("sched_priority set to %d\n", sched.sched_priority);
    }

    return 0;
}

Expected Behavior

sched_setscheduler should succeed. My test code runs successfully in an Ubuntu Docker container in VirtualBox, and it also runs successfully directly on the WSL 2 host.

Actual Behavior

sched_setscheduler fails with EPERM in a Docker container on WSL 2.

Diagnostic Logs

No response

numo68 commented 1 year ago

Same here with slightly different functions called but same area (pthread_attr_setschedpolicy, pthread_attr_setschedparam, then creating a thread with those parameters that fails with EPERM).

Tried both --privileged and --cap-add=SYS_NICE, no change. capsh --print run inside the container confirms the capability is present. Different distro (Debian bullseye), different docker (20.10.5+dfsg1), same kernel.

numo68 commented 1 year ago

fwiw, mounting cgroup v2 at /sys/fs/cgroup fixed the problem for me. After that, --cap-add=SYS_NICE is all that is needed.

The /etc/fstab line:

cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

SchwarzeMagie commented 1 year ago

Mounting cgroup v2 at /sys/fs/cgroup with that /etc/fstab line doesn't work for me 😢; sched_setscheduler still returns EPERM.

vdavitiani commented 1 year ago

Same issue here; mounting cgroup v2 didn't solve it. Reproduced on Win11 (22621.1848) with the following .wslconfig:

[wsl2]
nestedVirtualization=true
kernelCommandLine = cgroup_no_v1=all

vdavitiani commented 1 year ago

I had to revert to the default cgroup behavior (v1 with v2 in Ubuntu 20.04) and run the container with --cpu-rt-runtime=950000 and --cap-add=SYS_NICE. That resulted in:

failed to write 95000 to cpu.rt_runtime_us: write /sys/fs/cgroup/cpu,cpuacct/system.slice/.../cpu.rt_runtime_us: invalid argument

which was fixed by:

sudo sh -c 'echo 950000 > /sys/fs/cgroup/cpu/docker/cpu.rt_runtime_us'

Or, to make the change permanent, add --cpu-rt-runtime=950000 to the dockerd arguments (e.g. by editing docker.service).

For details, see https://stackoverflow.com/questions/46563332/docker-daemon-container-real-time-scheduling-with-ubuntu-linux-host/47999752

g-ulli commented 11 months ago

Same problem here. Can be reproduced with

docker run --rm -it \
    --ulimit rtprio=99 --privileged --security-opt seccomp=unconfined \
    --cap-add=sys_nice --userns host ubuntu:20.04 bash -c "chrt -f -p 50 1"

Same command works in WSL Ubuntu after modifying rtprio ulimits.

WSL version:

WSL version: 1.2.5.0
Kernel version: 5.15.90.1
WSLg version: 1.0.51
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.3570