opencontainers / runc

CLI tool for spawning and running containers according to the OCI specification
https://www.opencontainers.org/
Apache License 2.0

starting container process caused 'process_linux.go:245: running exec setns process for init caused "exit status 6"' #1130

Open hkjn opened 8 years ago

hkjn commented 8 years ago

Hi OCI folks,

We are seeing a failure to start Docker containers through runc, seemingly from this line:

This might well be a config or system issue (we're on somewhat old kernel versions because of CentOS), but the logs don't give us much to go on here.

The man page for setns(2) defines the error codes it can return:

But if the following page can be trusted, exit status 6 would be ENXIO, which is not mentioned in the man page:

Any suggestions for how to debug further or what to check would be appreciated, thanks in advance!

Logs

/bin/docker: Error response from daemon: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:245: running exec setns process for init caused \\\"exit status 6\\\"\"\n".

System info

# uname -a
Linux ip-10-226-24-78 3.10.0-327.28.2.el7.x86_64 #1 SMP Wed Aug 3 11:11:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

# docker info
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 6
Server Version: 1.12.2
Storage Driver: overlay
 Backing Filesystem: xfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-327.28.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.389 GiB
Name: ip-10-226-24-78
ID: TNS5:V674:K6Y4:CSIT:ROPR:XJMI:LDSR:KTC3:DZS7:G7RD:426H:DFRN
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 127.0.0.0/8

# free -m
              total        used        free      shared  buff/cache   available
Mem:           7566         207         453           5        6904        4230
Swap:          2047         463        1584

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
Stepping:              2
CPU MHz:               2400.082
BogoMIPS:              4800.16
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0,1
cyphar commented 8 years ago

The exit status 6 is a pretty ugly hack I added that lets us figure out where inside libcontainer/nsenter/nsexec.c your code is failing. An "exit status 6" error from process_linux.go means that the 6th bail() in that file was executed (in the version of runC you're running).
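
Roughly, the hack looks like this (paraphrased; the exact code varies between runC versions): each bail() numbers its own exit path with the preprocessor's __COUNTER__, so the exit status identifies the call site:

    #define bail(fmt, ...)                                                \
        do {                                                              \
            /* Each use of __COUNTER__ expands to 0, 1, 2, ..., so   */   \
            /* every call site gets a unique exit status, from 1 up. */   \
            int ret = __COUNTER__ + 1;                                    \
            fprintf(stderr, "nsenter: " fmt ": %m\n", ##__VA_ARGS__);     \
            exit(ret);                                                    \
        } while (0)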

To cut a long story short, this is the code that is failing:

    /*
     * We must fork to actually enter the PID namespace, and use
     * CLONE_PARENT so that the child init can have the right parent
     * (the bootstrap process). Also so we don't need to forward the
     * child's exit code or resend its death signal.
     */
    childpid = clone_parent(env, config->cloneflags);
    if (childpid < 0)
        bail("unable to fork"); /* this is where exit status 6 comes from */

So, the big question is -- does your system support all of the namespaces that you're trying to use? What is the output of ls -la /proc/self/ns?

hkjn commented 8 years ago

Ah, that helps explain the exit status, cheers.

What's odd here is that the failure was not consistent: sometimes the docker run command would work fine when we ran it manually, even though it failed under systemd; later it seemed to fail with this symptom consistently.

The node degraded further and won't even let me SSH in now, so it's unfortunately hard to get more diagnostics from it. Another node, which should be identically configured, gives the following output:

# ls -la /proc/self/ns
total 0
dr-x--x--x. 2 root root 0 Oct 20 09:13 .
dr-xr-xr-x. 9 root root 0 Oct 20 09:13 ..
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 ipc -> ipc:[4026531839]
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 mnt -> mnt:[4026531840]
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 net -> net:[4026532028]
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 pid -> pid:[4026531836]
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 user -> user:[4026531837]
lrwxrwxrwx. 1 root root 0 Oct 20 09:13 uts -> uts:[4026531838]

But that node does not seem to hit the same issue as the first one; all services seem to have their containers start up fine.

I'll attach the info from /proc/self/ns from a node with this issue if it pops up again. Feel free to close this bug or leave it open for others to chip in if they hit the same symptom (I couldn't find anything on Google by searching for the symptoms myself); your call.

cyphar commented 8 years ago

@hkjn Actually, the best thing would be for you to attach an strace -f of runc when the issue occurs. Though since you're using Docker this might prove difficult (and it will have very large performance effects that aren't favourable). If you can reproduce having a node like that again, please try setting up and running any runC container (without Docker) on that machine with strace -f runc run ... to see what breaks. Thanks.
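
For example (a sketch; adjust names and paths to taste), writing the trace to a file makes it easier to dig through after the failure, and filtering for the namespace syscalls narrows it down:

% strace -f -o /tmp/runc.strace runc run -b bundle container
% grep -E 'clone|unshare|setns' /tmp/runc.strace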

rajasec commented 8 years ago

@cyphar When I run nested runc (runc inside runc), I get the error below. It may not be the right use case; I just thought I'd test it out.

nsenter: unable to fork: Operation not permitted
container_linux.go:247: starting container process caused "process_linux.go:245: running exec setns process for init caused \"exit status 6\""

cyphar commented 8 years ago

@rajasec That's because you're trying to unshare namespaces you don't have the right to unshare. You'll have to take a look at the kernel code to figure out precisely what's happening (if you're trying to run runc from inside a chroot, for example, it's not going to work).

jaredbroad commented 7 years ago

+1. I have this error and don't use runC directly for anything (though it might be used inside Mono). It also happens intermittently, mostly when the machine is tight on resources / overloaded.

Any other tips for debugging the root cause if I'm not using runC directly?

jamiethermo commented 7 years ago

I have this error with Docker (I assume via docker-runc?). Not sure how I would debug it. Give me something to type and I'll type it.

cyphar commented 7 years ago

Some information that would be useful from anyone else who comments on this issue:

  1. Are you running Docker with user namespaces enabled?
  2. Is SELinux enabled on your host and/or container?
  3. Can you use runc by itself -- outside of Docker? Read the README for information on how to start up a simple container.
  4. What kernel version / distribution are you using?
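
If it helps, most of the above can be gathered quickly (a sketch; the userns check just greps for the daemon flag):

% ps ax | grep -o 'userns-remap[^ ]*'    # 1: user namespace remapping, if the daemon was started with it
% getenforce                             # 2: SELinux mode (Enforcing/Permissive/Disabled)
% uname -r; cat /etc/os-release          # 4: kernel version and distribution
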
jamiethermo commented 7 years ago

No user namespaces. SELinux is enabled and permissive. I don't have "runc"; I have "docker-runc", which says it's 1.0.0-rc2. Is that runc? CentOS 7.2: 3.10.0-327.36.2.el7.x86_64 #1 SMP Mon Oct 10 23:08:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I'll have to tool around with it. I don't get a container when following the runc README. Doing something daft, I expect.

cyphar commented 7 years ago

@jamiethermo docker-runc is just what Docker calls its packaged version of runc.

You can create a container like this:

% mkdir -p bundle/rootfs
% docker create --name a_new_rootfs alpine:latest true
% docker export a_new_rootfs | tar xvfC - bundle/rootfs
% runc spec -b bundle
% runc run -b bundle container
/ # # This is inside the container now.

Does that help?

jamiethermo commented 7 years ago

Ok. That works.

cyphar commented 7 years ago

Alright, it would help to know what config.json the container is being started with (under Docker). Unfortunately Docker won't save the config.json if the container creation fails. You could try doing something like this:

% cat >/tmp/dodgy-runtime.sh <<EOF
#!/bin/sh

cat config.json >>/tmp/dodgy-runtime.log
exit 1
EOF
% chmod +x /tmp/dodgy-runtime.sh
% docker daemon --add-runtime="dodgy=/tmp/dodgy-runtime.sh" --default-runtime=dodgy

Then try to start a container. It will fail, but you should be able to get the config.json from /tmp/dodgy-runtime.log. You can then modify it so that the rootfs entry equals the string "rootfs", and replace bundle/config.json from my previous comment with that file.
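
Assuming the log ends up holding a single JSON document, jq can do that rewrite (a sketch; the field is root.path in the OCI config):

% jq '.root.path = "rootfs"' /tmp/dodgy-runtime.log > bundle/config.json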

runC should then fail to start in the same way. Paste the config you get here.

jamiethermo commented 7 years ago

Ok, I can't do that right now. But since it seems arbitrary what runs and what fails (the same Docker image will run one minute and not the next), here's a config file that did get created; I don't know if that'll help. I'll try the hack above tomorrow. Thanks! config.json.zip

hqhq commented 7 years ago

For people who get "exit status x": grab the runc code you are using, then:

# cd libcontainer/nsenter
# gcc -E nsexec.c -o nsexec.i

Then you can find out from nsexec.i which bail you hit.

It's ugly though, we should improve it someday.

cyphar commented 7 years ago

@hqhq Or you can count from the start of the file (which is what I do). Vim even has a shortcut for it. But yes, the bail(...) code was a hack to get around the fact that we aren't writing our errors to the error pipe in nsexec -- the only information we get is the return code. :P
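
A shell one-liner can do the counting too (a sketch; it matches on the opening string literal, so the macro definition itself doesn't count as a hit):

% grep -n 'bail("' libcontainer/nsenter/nsexec.c | sed -n '6p'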

jamiethermo commented 7 years ago

@cyphar Could I replace docker-runc with a bash script that saves off the config.json somewhere if it crashes? Could we make runc do that by default?
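
Something like this is what I have in mind (hypothetical; it assumes the real binary has been moved to docker-runc.real and, as with the script above, that the runtime runs with the bundle directory as its working directory):

#!/bin/sh
# Save the bundle config before handing off to the real runtime.
cp config.json /tmp/docker-runc-last-config.json 2>/dev/null
exec /usr/bin/docker-runc.real "$@"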

cyphar commented 7 years ago

Could I replace docker-runc with a bash script that saves off the config.json somewhere if it crashes?

You could try that. By the way, if you haven't created an upstream bug report (in Docker) please do so.

Could we make runc do that by default?

I don't want to, mainly because it'd only be helpful for debugging things in certain cases under Docker. And runC is not just used inside Docker.

jamesongithub commented 7 years ago

ECS team thinks this issue is causing their agent to disconnect at times. Referenced https://github.com/aws/amazon-ecs-agent/issues/658#issuecomment-271752302

jaredbroad commented 7 years ago

I "fixed" by upgrading from Ubuntu 15.04 -> 16.04. It might be a bug in an old version that is no longer maintained.


jamesongithub commented 7 years ago

Hm, might have to try that.

jamesongithub commented 7 years ago

@cyphar Is there a workaround for this, besides upgrading to Ubuntu 16.04?

cyphar commented 7 years ago

@jamesongithub It's likely that issues of this form are kernel issues (and since Ubuntu has interesting kernel policies, upgrading might be your only option), unless you have some very odd configurations. As I mentioned above, the error only tells us what line inside libcontainer/nsenter/nsexec.c failed (and unshare can fail for a wide variety of reasons).
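
If you want to rule runC out entirely, a minimal unshare test is easy to write (a sketch; swap in whichever CLONE_NEW* flags your containers use):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* Try CLONE_NEWPID, CLONE_NEWNET, etc. to test other namespaces. */
        if (unshare(CLONE_NEWUSER) < 0) {
            perror("unshare(CLONE_NEWUSER)");
            return 1;
        }
        puts("unshare(CLONE_NEWUSER) succeeded");
        return 0;
    }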

freefood89 commented 7 years ago

I've been having this issue with RHEL 7.3 too (SELINUX=enforcing, SELINUXTYPE=targeted).

Besides being inexperienced with things like namespaces and runc, I'm struggling to figure out what's going on because it's intermittent, as mentioned by @jamesongithub.

ls -la /proc/self/ns shows the same results as @hkjn's.

frezbo commented 6 years ago

@cyphar @rhatdan Same issue on RHEL 7.4, but the exit status is 40. User namespaces are enabled as per this doc: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/getting_started_with_containers/get_started_with_docker_formatted_container_images#user_namespaces_options.

This is on the latest available kernel.

frezbo commented 6 years ago

For anyone having issues with RHEL: enable only namespace.unpriv_enable=1, not user_namespace.enable=1. Having both on the kernel command line causes issues:

[ec2-user@ip-10-16-1-55 mycontainer]$ cat /proc/cmdline | grep "namespace.unpriv_enable=1"
BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.11.6.el7.x86_64 root=UUID=de4def96-ff72-4eb9-ad5e-0847257d1866 ro console=ttyS0,115200n8 console=tty0 net.ifnames=0 crashkernel=auto LANG=en_US.UTF-8 namespace.unpriv_enable=1
[ec2-user@ip-10-16-1-55 mycontainer]$ runc --root /tmp/runc run --no-pivot --no-new-keyring mycontainerid
/ # 
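
For reference, the boot argument can be managed with grubby instead of editing the GRUB config by hand (a sketch; double-check the flag names against the RHEL docs):

# grubby --update-kernel=ALL --args="namespace.unpriv_enable=1"
# grubby --update-kernel=ALL --remove-args="user_namespace.enable=1"
# reboot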
chadfurman commented 6 years ago

I came here from Google for a similar error. It turns out I was trying to use the VOLUME directive in my Dockerfile like this:

VOLUME . /src

thinking I could mount the current directory from the host as a volume that way, but that's not how it works.

You have to do this instead:

VOLUME /src

followed by:

docker run -v /absolute/path/to/directory/on/host:/src <rest of your docker run command>

Note also (somewhat unrelated) that I was getting similar errors on Fedora that were simply related to SELinux. While I don't recommend doing the following for security reasons (see http://stopdisablingselinux.com/), it did work for me:

sudo setenforce 0
sudo systemctl restart docker
docker build -t image .
docker run image
smileusd commented 6 years ago

I hit the same problem when building and starting an image.

Sending build context to Docker daemon   220 MB
Step 1 : FROM warpdrive:tos-release-1-5
 ---> 769306738d96
Step 2 : COPY . /go/src/github.com/transwarp/warpdrive/
 ---> 07c99697b16e
Removing intermediate container 127c0e71a84b
Successfully built 07c99697b16e
/usr/bin/docker: Error response from daemon: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:245: running exec setns process for init caused \\\"exit status 6\\\"\"\n".
FATA[0301] exit status 125                              
make: *** [build] Error 1

Then I cleaned up a lot of images and containers and freed the caches, and the problem disappeared. But I don't think it's a cache problem, because the change in cache usage was tiny.

meirwah commented 5 years ago

seems related to: https://forums.docker.com/t/centos7-docker-hello-world-fails/68941/3

yipingxx commented 5 years ago

It is a bug in the kernel (3.10.0-327); try updating your kernel version.