docker service create doesn't allow --privileged flag

ghost commented 8 years ago

Output of docker version:

Client:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   e4a0dbc
 Built:        Wed Jul 13 03:39:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   e4a0dbc
 Built:        Wed Jul 13 03:39:43 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 54
Server Version: 1.12.0-rc4
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 71
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: active
 NodeID: 33ops9juo9ea1twbfq2dyt89y
 IsManager: Yes
 Managers: 2
 Nodes: 5
 CACertHash: sha256:cef0da32ea05dd1038a5b8ae1a3a6956b6a5efa2d2fcad535a696dd568220197
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 3.13.0-86-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 94.42 GiB
Name: irvm-ggallag
ID: WA3H:N54J:H7F3:CQV6:74ZX:IWIZ:U6XG:2VCB:45LP:LDD5:FHB6:7CWZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.): Ubuntu 14.04 VM under KVM running Docker Engine 1.12 RC4

Steps to reproduce the issue:

Run docker service create
Inside the resulting container, attempt an NFS mount

Describe the results you received: I can run "docker run --privileged" to allow an NFS mount from within my container. However, there is no way to pass the --privileged flag to "docker service", and without it the mount inside the container fails with:

mount: permission denied

Describe the results you expected: I should be able to have my container mount an NFS server from within it. I do not want to do this externally or via a docker volume. For example, I am trying to drive a huge number of parallel containers, each doing its own NFS mounts and I/O.

Additional information you deem important (e.g. issue happens only occasionally):

justincormack commented 8 years ago

I think there is a whole set of issues for these features on service create, we should probably make an issue listing them all.

I think the plan was to discuss what should be added, once 1.12 is released.

thaJeztah commented 8 years ago

Correct, I was planning to create a tracking issue for that

gerred commented 8 years ago

I could really use this for 1.12 as well. If this is an area I could jump in and issue a PR, I'm happy to get started on it.

thaJeztah commented 8 years ago

We need to decide first; services are not "containers", so not all options can be / should be copied to service create

gerred commented 8 years ago

@thaJeztah Another consideration - I have different needs between a replicated service and a global one. If a jobs service type is introduced, which has been discussed, those needs might be different too.

The global one I may expect to have more flags/options around, just given the nature of "other things" I might be doing with them (monitoring, networking, running containers, etc.). I suppose I could have a global service that mounts the docker socket and then runs a privileged container on each node, but that seems messy (now the tasks in my global service are managing the lifecycle of a container on each engine separately).

Hopefully that helps with some of that discussion.

ghost commented 8 years ago

If services != containers, why do you pass an image name to the create command? Seems like you would rather pass something like a manifest (maybe exactly like the docker-compose.yml file ?).

For this issue, if it is a PR, I'll phrase it in user story format: As a user of swarm, I want to create services and containers which run under privileged mode. How do I do this?

I'm happy to help any way that I can!

gerred commented 8 years ago

@frellus You're right - stacks/DABs are really what I need but they're also really early and don't have a service type option associated with them yet. There's also some other little nits there I need to write a more specific issue around in compose. Ultimately it's all still a bit of a chicken and egg problem - in one I get privileged, in the other I get service types. :) It'll all shake out, for now just reporting my uses to help provide as much data as I can! :smile:

padyx commented 8 years ago

I am very interested in this, because as far as I know the Oracle DB cannot run in a container without either the --privileged option or specifying the --shm-size flag when running a container.

Given that neither is supported yet in API 1.24 for services AFAICS, it would be impossible to replace a Docker Swarm (standalone) with Docker 1.12 Swarm to run such services.

Edit: Oracle just published Docker files https://github.com/oracle/docker-images/tree/master/OracleDatabase, so this big hurdle is resolved for us.

bboreham commented 8 years ago

Drive-by observation: you should do --cap-add before --privileged, to encourage people to be more granular in what they need. #25885 relates.

b0ch3nski commented 8 years ago

I'd really like to see it implemented soon. There are more solutions that require the --privileged flag to function properly, e.g. cAdvisor, which I'm using for container performance monitoring.

acaranta commented 8 years ago

If I may add: --privileged and/or --device* are quite critical when you need to run containers doing GPU/CUDA calculations. Placement rules can be used to make these kinds of containers run on specific hosts, but not being able to actually use the GPU kind of renders swarm mode useless for us :(

AkihiroSuda commented 8 years ago

Just FYI, linking the PR for supporting "device" to this issue: https://github.com/docker/swarmkit/issues/1244 https://github.com/docker/swarmkit/pull/1355

Even though there would be --device, I think --privileged is still attractive (e.g. for DinD)

frimdo commented 8 years ago

+1

--cap-add and --cap-drop are a must. --privileged would be nice. Are there any plans to implement it?

fjammes commented 8 years ago

We are missing --cap-add to use swarm mode in production, and my management is pushing me to move towards Kubernetes if this option is not added soon. Do you have a plan and a timeline for adding this feature, please?

dalefwillis commented 8 years ago

+1

--cap-add would be a huge help!

seiferteric commented 8 years ago

Working on kind of a workaround. It will run your privileged app in a secondary container by mounting /var/run/docker.sock in your service and proxying tcp connections back to the service container with socat and unix sockets. Still needs some work though.

calh commented 7 years ago

I'd also like to see the --cap-add on docker services. I've written a workaround, similar to @seiferteric's if anyone would like to try it out: https://github.com/calh/docker_priv_proxy

I'm using signal traps, socat, and the docker socket to pair swarm mode service containers with a local privileged mode container. It seems to work well so far!
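The pattern described above (a service task that talks to the local Docker socket and starts a privileged companion container) can be sketched roughly as follows. This is a minimal illustration, not the code from docker_priv_proxy: the `DOCKER` variable, the `SIDECAR_NAME` naming scheme, and the image name are assumptions made up for the example, and the socat proxying is left out entirely. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a service entrypoint that pairs its task with a privileged
# container on the same node. Assumes the service was created with a bind
# mount of /var/run/docker.sock, which by itself already grants
# root-equivalent access to the host.

# Dry run by default: commands are printed, not executed.
# Set DOCKER=docker inside the service container to run them for real.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical naming scheme: one companion container per service task.
SIDECAR_NAME="priv-${HOSTNAME:-task}"

start_sidecar() {
  # $1 = image that needs --privileged
  $DOCKER run -d --rm --privileged --name "$SIDECAR_NAME" "$1"
}

stop_sidecar() {
  $DOCKER rm -f "$SIDECAR_NAME"
}

# When swarm stops the task, take the privileged companion down with it.
trap 'stop_sidecar; exit 0' TERM INT

start_sidecar "example/privileged-app"   # hypothetical image name
# A real entrypoint would now block (for example on the socat proxy)
# until it receives a signal.
```

The service itself would be created with something like `docker service create --mode global --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock <image>`, where `<image>` is whatever image carries this entrypoint.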

seiferteric commented 7 years ago

Just another use case, I want to run keepalived with vrrp on a swarm and it needs net=host and --cap-add=NET_ADMIN, so cap-add would be great.

TAGC commented 7 years ago

+1

I want to be able to run a dockerised web service on an RPi that interfaces with a microcontroller over USB and provides web-based access to it over a network. I'm blocked by this and dotnet/dotnet-docker#223.

AkihiroSuda commented 7 years ago

Folks, please avoid filling up this issue with +1 😭 You could click the 👍 button instead.

subfuzion commented 7 years ago

Here's a specific (admittedly edge) use case: I distribute DinD containers around a cluster of host nodes (the DinD containers allow us to run mini-isolated test swarms). If I were able to use privileged services on an outer host swarm, I could take advantage of swarm for the automatic distribution of these workloads (and named access) without having to manage these details manually.

Toshik commented 7 years ago

Any news on --device for services?

cpuguy83 commented 7 years ago

@Toshik #33440. It's not complete in terms of actually allocating the device, but it puts the right APIs in place.

thaJeztah commented 7 years ago

Please don't +1 this issue, it's not helpful. If you have specific requirements to need this feature, please describe the use case, which is more helpful.

At this point it is not likely that --privileged will be added as-is to docker service, as this option nullifies all security that containers provide (TL;DR; running a service/container with --privileged effectively gives root access to the host).

Instead, more fine-grained permissions are needed. While --cap-add / --cap-drop is an option, it doesn't solve everything; for example, security profiles (seccomp, SELinux, AppArmor) are also related. Setting the right combination of options is complex (and cumbersome), which is why people often "just set --privileged" to make things work (and in doing so run highly insecure; see above).

Security should be usable. Therefore, a proposal has been opened to work towards a design that allows people to give services the privileges they need in a more fine-grained way, without introducing the complexity of setting all options manually.

The proposal can be found here: "Proposal: Entitlements in Moby", along with the POC repository at https://github.com/docker/libentitlement. There is also a Google document describing the proposal in more detail.

We welcome feedback on the proposal; either by commenting on that issue, or taking part in the Orchestration Security SIG meetups (see https://forums.mobyproject.org/c/sig/orchestration-security)

@2stacks @blop @dalefwillis @danmanners @dfresh613 @dl00 @Farfad @galindro @getvivekv @gklbiti @joaquin386 @Kent-H @matansag @MingShyanWei @pinkal-vansia @realcbb @siavash9000 @soar @spfannemueller @TAGC @vingrad @voron3x @wangsquirrel

I removed your +1 comments, as they don't add useful input to the discussion; if you don't have a particular use-case / additional information that can help getting the design right, use the :+1: emoji in the top comment to let others know you're interested in this (see @AkihiroSuda's comment https://github.com/moby/moby/issues/24862#issuecomment-296577597). If you commented just to get notified on updates; use the subscribe button on the right side of this page.

wannabesrevenge commented 7 years ago

This would really be helpful for getting more detailed cluster info. In my use case, I need to get info on memory, InfiniBand devices, GPUs, network health, PCI topology, and some other things. We need to periodically recheck this info. We use Swarm for our cluster-level scheduling, and being able to run something like this as a global service so that all nodes could periodically report back info about themselves would be a huge benefit.

ATM we are limited to what docker tells us about the system. If we want any more detailed info, we have to go on the box and run the service ourselves or get a second scheduler.

elliotwoods commented 7 years ago

Use case : Distributed embedded environment with multiple nodes (>100) running headless in the wild. Need to be able to remotely update applications / environments. The embedded systems operate external electrical devices (e.g. smart home / industrial automation). Example devices:

USB to serial (either the USB or the TTY device could be shared)
UARTs (TTY)
GPIO pins
I2C / SPI
On-board LEDs

Current plan is to run without docker service (i.e. 'manual swarm') until a workable solution exists.

chestnutt commented 7 years ago

Another use case: Using a Container-as-a-service environment (so no ability to install anything on the host). We want to deploy an image (--mode=global) that can report back the stats we care about on all other docker containers running on the same host, as well as on the host itself. (Similar to @wannabesrevenge's case )

sgohl commented 7 years ago

I mainly need this for bootstrapping/provisioning nodes, gathering hardware relevant data, setting ips for the host... Currently I have a service running which then runs a script which does docker run --privileged. That's ridiculous.

marcellodesales commented 7 years ago

I'm in desperate need of this feature to run a mode: global service in swarm using docker-compose v3 for monitoring at the host level...

My current workaround is to drop the docker-compose.yml and use compose v2 instead and execute it with privileged: true.
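For readers unfamiliar with that fallback, it looks roughly like the sketch below. The service name and image are placeholders invented for the example; only `version: "2"` and `privileged: true` come from the comment above.

```shell
#!/bin/sh
# Write a Compose v2 file that starts one privileged monitoring container.
# The service name and image are placeholders, not taken from the thread.
cat > docker-compose.yml <<'EOF'
version: "2"
services:
  monitor:
    image: example/host-monitor
    privileged: true
    restart: unless-stopped
EOF

# A v2 file is started per host with docker-compose rather than scheduled
# by swarm, so this has to be repeated on every node:
#   docker-compose up -d
```

The trade-off is the one the comment implies: the container gets its privileges, but swarm no longer places or restarts it across the cluster.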

goalotc commented 7 years ago

Another use case: deploying containers such as sysdig. It is unreasonable that seccomp is promoted by Docker yet not supported for swarm services.

ab77 commented 7 years ago

IPSec VPN with strongSwan/Openswan is another use-case for privileged support.

TAGC commented 7 years ago

I have an ASP.NET Core web service that I've dockerised. Part of the operation of this web service involves communicating with a microcontroller over USB. I can run it on my local Windows 7 machine using Docker Toolbox by passing the USB device through to Virtualbox (so the docker VM has access to it).

On Windows 10 deployment machines it would be preferable to use Docker For Windows, which runs Docker on top of Hyper-V. However, Hyper-V apparently doesn't support USB pass-through like Virtualbox does. One idea I had to work around this would be to split my webservice into two separate ones; one that handles all the USB communication logic and the other handling everything else. That would mean I could run the former (the "device comms" service) on a Raspberry Pi and the latter would be free to run anywhere, even in a non-privileged container. The idea would be to have a Docker swarm with these two services linked to each other.

However, how can I create a "device comms" service which requires privileged access to the filesystem if --privileged is not supported? How do I make any sort of Dockerised application that needs to perform USB communication on a Windows 10 machine?

This issue has been open for 14 months and it doesn't seem like any progress is being made.

man4j commented 7 years ago

I want to deploy GlusterFS in swarm, but GlusterFS doesn't work without the --privileged flag ((( Please help!

purplesrl commented 7 years ago

@thaJeztah

The proposal looks great, but that looks like a complete change of the way privileges work, which will take a loooooong time.

Meanwhile, docker allows you to run --privileged or to use --cap-add for containers, so why not provide the same facility for services and let users decide whether they want to run containers securely or insecurely until that proposal is implemented? I think this step would be simple; it is just a matter of passing variables when creating the containers on the other hosts.

knick-burns commented 7 years ago

It surprises me that security is thrown around as the reason why this has not been implemented, yet there is no inherent way to restrict egress traffic other than an external device, or by manipulating the ip tables.

I don't really need you to be concerned with how I run my applications.

TAGC commented 7 years ago

@purplesrl and @knick-burns sum it up.

"At this point it is not likely that --privileged will be added as-is to docker service, as this option nullifies all security that containers provide"

I don't care about the security that containers provide. So what if a Docker container has unrestricted privileges when the host it's running on is a £30 throw-away Raspberry Pi? Why not just permit people to use --privileged and if they shoot themselves in the foot with it, that's on them?

What I actually do want is the service discovery, self-healing, self-containment and modularity you get from using Docker swarm.

edwardofclt commented 7 years ago

I agree this would be a great help. Mounting NFS is impossible without hacking the stuff apart.

cpuguy83 commented 7 years ago

@edward-of-clt you can always use a volume to mount NFS.

edwardofclt commented 7 years ago

@cpuguy83 When you can point me to documentation that works for docker service and docker stack, I'll believe you. I have spent the last two days looking for NFS mount documentation and nothing I have found works.

cpuguy83 commented 7 years ago

@edward-of-clt In addition to the built-in support for low-level NFS options, there are a number of volume drivers out there as well.

For the built-in support:

services:
    foo:
        mounts:
            - type: volume
              source: mynfs
              target: /data

volumes:
    mynfs:
        driver_opts:
            type: "nfs"
            o: addr=<nfs server addr>
            device: :/path/to/share
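The same volume options can also be passed straight to `docker service create` through `--mount`. The sketch below only builds and prints such a command. The helper name `nfs_mount_spec`, the server address 10.0.0.10, and the service and image names are placeholders chosen for illustration.

```shell
#!/bin/sh
# Build the --mount value for an NFS-backed volume using the built-in
# "local" volume driver. All four arguments are placeholders.
nfs_mount_spec() {
  # $1 = volume name, $2 = target path in the container,
  # $3 = NFS server address, $4 = exported path on the server
  printf 'type=volume,source=%s,target=%s,volume-driver=local,volume-opt=type=nfs,volume-opt=device=:%s,volume-opt=o=addr=%s\n' \
    "$1" "$2" "$4" "$3"
}

# Print (rather than run) the resulting command; the service and image
# names are hypothetical.
echo docker service create --name foo \
  --mount "$(nfs_mount_spec mynfs /data 10.0.0.10 /path/to/share)" \
  example/image
```

With this form, each node that runs a task creates the volume locally and performs the NFS mount on the host side, so the container itself needs no extra privileges.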

edwardofclt commented 7 years ago

See, I tried that. It's not working. Nothing ends up being mounted at the mount point.

And this is what I always get:

mounts Additional property mounts is not allowed

cpuguy83 commented 7 years ago

Probably need to set the "version" at the top of the stack file to 3.1 or 3.2... Can't remember what version it came out in. 3.4 is what's included with 17.09.
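A stack file along those lines might look like the sketch below. It assumes the Compose long volume syntax, which sits under the service-level `volumes:` key (not `mounts:`) and, as far as I know, arrived with file format 3.2. The server address, export path, and image name are placeholders.

```shell
#!/bin/sh
# Write a stack file that mounts an NFS export through a named volume.
# Server address, export path, and image are placeholders.
cat > stack.yml <<'EOF'
version: "3.2"
services:
  foo:
    image: example/image
    volumes:
      - type: volume
        source: mynfs
        target: /data
volumes:
  mynfs:
    driver_opts:
      type: "nfs"
      o: "addr=10.0.0.10"
      device: ":/path/to/share"
EOF

# Deploy on a manager node (the stack name is arbitrary):
#   docker stack deploy -c stack.yml mystack
```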

edwardofclt commented 7 years ago

That's not working either.

cpuguy83 commented 7 years ago

Happy to help on the community slack.

edwardofclt commented 7 years ago

I'm on there, but can't see anything but the messages that have been sent to me.

purplesrl commented 7 years ago

@edward-of-clt

From what I have researched you can use this:

https://github.com/mavenugo/swarm-exec

It executes a command across the swarm with the privileged flag. There is also a tutorial for what you need here:

https://www.vip-consult.solutions/post/persistent-storage-docker-swarm-nfs

cpuguy83 commented 7 years ago

We talked offline. The linked items are really not what you want for mounting NFS... but in the interest of keeping this issue from diverging if anyone would like to discuss this, feel free to ping me on slack.

efocht commented 7 years ago

Our use case is that we need to assign fixed IPs to services over macvlan. We could re-assign the IPs from inside the container if we had --privileged or a proper --cap-add. Nicer still would be an --ip option in the service definition, similar to the way it is done with docker run.

The problem with macvlan is that it is unusable: when starting two services on two different worker nodes, we get the same IP for both.

blop commented 7 years ago

@efocht actually, you need to split the subnet of the swarm-scoped macvlan network for each worker. It seems that IP allocation for macvlan networks is not distributed at the moment.

I do something like (splitting /16 into sub /24 per worker) :

# on worker 1
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.1.0/24 local-network-name
# on worker 2
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.2.0/24 local-network-name
# on worker 3
docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range 10.140.3.0/24 local-network-name

# on manager
docker network create -d macvlan --scope swarm --attachable --config-from local-network-name swarm-network-name
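With more than a handful of workers, the per-worker commands above can be generated instead of typed by hand. A small sketch, assuming workers are simply numbered from 1 and each gets the /24 matching its number; the helper names are made up for the example, and the script only prints the commands.

```shell
#!/bin/sh
# Carve one /24 per worker out of the shared 10.140.0.0/16.
worker_ip_range() {
  # $1 = worker index, 1..255 (index 0 is skipped here because
  # 10.140.0.0/24 contains the gateway address 10.140.0.1)
  if [ "$1" -lt 1 ] || [ "$1" -gt 255 ]; then
    echo "worker index must be between 1 and 255" >&2
    return 1
  fi
  printf '10.140.%s.0/24\n' "$1"
}

config_only_cmd() {
  # Print the config-only network command to run on worker $1.
  range="$(worker_ip_range "$1")" || return 1
  printf 'docker network create --config-only --subnet 10.140.0.0/16 --gateway 10.140.0.1 -o parent=ens160 --ip-range %s local-network-name\n' "$range"
}

for i in 1 2 3; do
  echo "# on worker $i"
  config_only_cmd "$i"
done
```

Because the ranges do not overlap, two workers can never hand out the same address, which is the point of the split.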

efocht commented 7 years ago

The setup above won't allow failover of a service with a particular IP, for example a user-space NFS server. I need service1 to have ip1 and service2 to have ip2, and the services must keep their IPs when they fail over.
