error when forking: Operation not permitted (os error 1)

de-code commented 6 years ago

Not sure if that is related to #156

I'm getting the error "error when forking: Operation not permitted (os error 1)".

Steps to reproduce:

Using Dockerfile for vagga:

FROM ubuntu:18.04

RUN apt-get update \
    && apt-get install -y ca-certificates

RUN echo 'deb [arch=amd64 trusted=yes] https://ubuntu.zerogw.com vagga main' \
    | tee /etc/apt/sources.list.d/vagga.list \
    && apt-get update \
    && apt-get install -y vagga

Build vagga image:

docker build -t dummy/vagga:develop .

Save sample image (busybox.tar.gz):

docker save busybox | gzip > busybox.tar.gz

vagga.yaml:

containers:
  busybox:
    setup:
    - !Tar
      url: /opt/busybox.tar.gz
commands:
  ls: !Command
    container: busybox
    run: ls

Run:

docker run --rm \
    --volume "$(pwd)/busybox.tar.gz:/opt/busybox.tar.gz" \
    --volume "$(pwd)/vagga.yaml:/opt/vagga.yaml" \
    --workdir "/opt" \
    dummy/vagga:develop \
    vagga ls

Error:

Command <Command "/proc/self/exe" "__wrapper__" "_build" "busybox"; environ[3]; uid_map=[UidMap { inside_uid: 0, outside_uid:0, count: 1 }, UidMap { inside_uid: 1, outside_uid: 1, count: 65535 }]; gid_map=[GidMap { inside_gid: 0, outside_gid: 0, count: 1 }, GidMap { inside_gid: 1, outside_gid: 1, count: 65535 }]>: error when forking: Operation not permitted (os error 1)

tailhook commented 6 years ago

Hm, can you try the tailhook/vagga image? That one is tested. This would let us find out whether the problem is with the docker image or with some kind of kernel setting.

Also take a look at sysctl kernel.unprivileged_userns_clone and CONFIG_USER_NS (full instructions are here)
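
For reference, a minimal sketch of how those two settings could be inspected from a shell. The function name `report_userns_support` is made up for illustration, and the kernel config locations (`/boot/config-$(uname -r)` and `/proc/config.gz`) are assumptions that vary by distribution; the sysctl itself only exists on kernels carrying the Debian/Ubuntu patch.

```shell
#!/bin/sh
# Report the two kernel settings mentioned above. Both lookups are
# best-effort: the sysctl is a Debian/Ubuntu kernel patch, and the kernel
# config file location differs between distributions (assumed paths below).

report_userns_support() {
    knob=/proc/sys/kernel/unprivileged_userns_clone
    if [ -r "$knob" ]; then
        echo "kernel.unprivileged_userns_clone = $(cat "$knob")"
    else
        echo "kernel.unprivileged_userns_clone: not present on this kernel"
    fi

    for cfg in "/boot/config-$(uname -r)" /proc/config.gz; do
        [ -r "$cfg" ] || continue
        # /proc/config.gz is compressed; the /boot file is plain text.
        # The pattern matches both "CONFIG_USER_NS=y" and
        # "# CONFIG_USER_NS is not set".
        case "$cfg" in
            *.gz) zcat "$cfg" ;;
            *)    cat "$cfg" ;;
        esac | grep 'CONFIG_USER_NS[= ]' && return 0
    done
    echo "CONFIG_USER_NS: no readable kernel config found"
}

report_userns_support
```

Since containers share the host kernel, running this inside the container should report the host's values, as long as the files are visible there.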

de-code commented 6 years ago

I'm getting the same error with the official image. It doesn't seem to get as far as reading the referenced busybox image, since it makes no difference whether the file is there or not.

On the host and vagga container: kernel.unprivileged_userns_clone = 1

Not sure about CONFIG_USER_NS. I'm on ubuntu.

I'm not entirely sure whether that section applies to the host or the container where vagga is running. I am assuming the latter. (I also won't have much control over the host where I want to run it)

In the vagga container I am not getting anything in /etc/subgid or /etc/subuid (the files are empty).

I noticed I didn't include the vagga.yaml in my issue. I've updated the description.

tailhook commented 6 years ago

> I'm not entirely sure whether that section applies to the host or the container where vagga is running

To the host. But if the value in the container is 1, then it's okay (the value on the host is the same).

> In the vagga container I am not getting anything in /etc/subgid or /etc/subuid (the files are empty).

That's fine as long as the user is root.

There are two probable reasons for "Operation not permitted" in your case: either apparmor blocks it (or selinux, but apparmor is more likely on ubuntu), or docker imposes some limitation (either in the new version, or because it is configured in some specific way).

Which version of docker do you have?
Can you look at the apparmor log and see whether it has blocked something? (it should be in /var/log/messages by default, as far as google tells me)
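
A rough sketch of how that check might be done. AppArmor marks blocked operations in the kernel log with `apparmor="DENIED"`. The helper name `find_apparmor_denials` is made up, and `/var/log/kern.log` and `/var/log/syslog` are assumed Ubuntu defaults added alongside the path mentioned above. The commented `docker run` at the end is only a diagnostic idea: it turns off Docker's default seccomp and AppArmor profiles, which is one candidate for the docker limitation, not something confirmed in this thread.

```shell
#!/bin/sh
# Search kernel logs for AppArmor denials.

find_apparmor_denials() {
    # Arguments: log files to scan; missing or unreadable ones are skipped.
    for log in "$@"; do
        [ -r "$log" ] && grep -H 'apparmor="DENIED"' "$log"
    done
    return 0
}

# Assumed Ubuntu log locations, plus the path suggested above.
find_apparmor_denials /var/log/kern.log /var/log/syslog /var/log/messages

# The in-memory kernel ring buffer, in case nothing is written to disk.
dmesg 2>/dev/null | grep 'apparmor="DENIED"' \
    || echo "no AppArmor denials in dmesg (or dmesg not readable)"

# To rule Docker's own confinement in or out, the failing command could be
# retried with both default profiles switched off (diagnostic only, since
# this weakens isolation):
#
#   docker run --rm \
#       --security-opt seccomp=unconfined \
#       --security-opt apparmor=unconfined \
#       --volume "$(pwd)/busybox.tar.gz:/opt/busybox.tar.gz" \
#       --volume "$(pwd)/vagga.yaml:/opt/vagga.yaml" \
#       --workdir "/opt" \
#       dummy/vagga:develop \
#       vagga ls
```

If the error disappears with the profiles off, re-enabling them one at a time would show which one is responsible.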

Another question is why do you run vagga in docker?

de-code commented 6 years ago

> There are two probable reasons for "Operation not permitted" in your case: either apparmor blocks it (or selinux, but apparmor is more likely on ubuntu), or docker imposes some limitation (either in the new version, or because it is configured in some specific way).
>
> Which version of docker do you have?

Docker version 18.06.0-ce, build 0ffa825

> Can you look at the apparmor log and see whether it has blocked something? (it should be in /var/log/messages by default, as far as google tells me)

I can see that the docs (linked from the main) also mention those files, but neither /var/log/messages nor /var/log/audit/audit.log exists. dmesg doesn't seem to show anything suspicious, but I'm including it anyway:

[21904.261460] docker0: port 1(veth807c65a) entered blocking state
[21904.261469] docker0: port 1(veth807c65a) entered disabled state
[21904.262194] device veth807c65a entered promiscuous mode
[21904.264734] IPv6: ADDRCONF(NETDEV_UP): veth807c65a: link is not ready
[21904.501072] eth0: renamed from veth54408d8
[21904.525625] IPv6: ADDRCONF(NETDEV_CHANGE): veth807c65a: link becomes ready
[21904.525794] docker0: port 1(veth807c65a) entered blocking state
[21904.525802] docker0: port 1(veth807c65a) entered forwarding state
[21904.970908] docker0: port 1(veth807c65a) entered disabled state
[21904.971315] veth54408d8: renamed from eth0
[21905.027027] docker0: port 1(veth807c65a) entered disabled state
[21905.031798] device veth807c65a left promiscuous mode
[21905.031809] docker0: port 1(veth807c65a) entered disabled state

After restarting apparmor, dmesg also showed some audit logging from apparmor. I also tried stopping apparmor, with the same outcome.

> Another question is why do you run vagga in docker?

I was hoping that I could use vagga to run a docker container in a serverless environment like Google's Dataflow or Pipeline API, which run unprivileged docker containers. The Pipeline API allows me to run my own docker container, but I want to run an existing third-party container within that container to perform the actual work. Dataflow doesn't really allow me to specify the container, and dependencies need to be installed (via the package manager or pip). With the Pipeline API I could also extend the third-party container, but I have a number of different ones, which gets messy and is still limiting (as I couldn't run two containers that way). The alternative is for me to run a separate cluster with the desired containers, but then one set of workers depends on another set of workers and I lose some of the serverless benefits. (I only need the third-party containers for a limited time, i.e. not 24/7.)

mkpankov commented 5 years ago

Just hit this one, too.

mkpankov commented 5 years ago

It appears the issue might be https://github.com/google/gvisor/issues/144#issuecomment-476287995

Linux forbids creation of user namespaces while in a chroot. The man pages for both clone(2) and unshare(2) use the same wording:

    EPERM (since Linux 3.9)
          CLONE_NEWUSER was specified in flags and the caller is in a
          chroot environment (i.e., the caller's root directory does not
          match the root directory of the mount namespace in which it
          resides).

Some people recommend using pivot_root instead of chroot. Is there a way I can hack that into vagga to test?

tailhook commented 5 years ago

@mkpankov, vagga uses pivot_root everywhere exactly for this reason. So the issue is somewhere else.

tailhook commented 5 years ago

And well, it works the other way around: the external container needs to use pivot_root instead of chroot, not vagga itself. So it might be that docker uses chroot (although I think it shouldn't).

tailhook commented 5 years ago

If you have some time to play with it, you can try the following things:

Run unshare with different flags (user + mount should probably fail).
Mount another tmpfs filesystem, pivot_root into it (using the command-line tool from util-linux), and then run vagga there (i.e. the mount root would be owned by the current user; this sometimes helps, but unfortunately I don't remember the exact restrictions).
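
The first suggestion could be sketched roughly as below. The helper name `try_unshare` is made up for illustration; the flags are those of unshare(1) from util-linux, which is assumed to be installed in the container. The tmpfs + pivot_root experiment is not covered here.

```shell
#!/bin/sh
# Try creating several namespace combinations and report which ones the
# environment permits. Assumes unshare(1) from util-linux is available.

try_unshare() {
    # All arguments are passed to unshare as flags; `true` is the trivial
    # command run inside the new namespaces.
    if err=$(unshare "$@" true 2>&1); then
        echo "ok:     unshare $*"
    else
        echo "failed: unshare $* ($err)"
    fi
}

try_unshare --mount
try_unshare --user
try_unshare --user --mount
try_unshare --user --map-root-user --mount
```

Running it both on the host and inside the docker container should show whether the restriction comes from the container environment or from the kernel configuration.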

tailhook / vagga

error when forking: Operation not permitted (os error 1) #503