hashicorp / docker-consul

Official Docker images for Consul.

Docker socket permissions for health checks? #50

Open dmaze opened 7 years ago

dmaze commented 7 years ago

I'm currently using a locally-built Consul image that runs as root and includes a couple of Docker health checks. I'm trying to migrate to the official image and am running into permission issues.

Say I run Consul as

docker run --name consul \
  --net host \
  -v $PWD/consul:/consul/data:Z \
  -v $PWD/etc/consul:/consul/config:Z \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e CONSUL_ALLOW_PRIVILEGED_PORTS=true \
  consul:0.7.2 agent

where the health checks, recursor settings, etc. are in the $PWD/etc/consul directory.

If the Docker socket on the host (/var/run/docker.sock) is world-writeable, this works fine. If I'm on an Ubuntu 16.04 host, where the socket is mode 0660 owned by user 0 group 16, the Docker health checks silently fail, since this container (via gosu) runs its process as user 1000 group 1000.

Are there best practices for giving Consul permission to docker exec? I'm not comfortable with the broader implications of making the Docker socket world-writeable or opening a TCP version of it; I only want to give permission to Consul.

dweomer commented 7 years ago

Have you tried giving your consul container access to the socket via group membership? E.g.

docker run --name consul \
  --net host \
  --group-add $(stat -f '%g' /var/run/docker.sock) \
  -v $PWD/consul:/consul/data:Z \
  -v $PWD/etc/consul:/consul/config:Z \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -e CONSUL_ALLOW_PRIVILEGED_PORTS=true \
  consul:0.7.2 agent

Notice the --group-add flag.
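
As a quick sanity check that an added group actually reaches a container's process, independent of consul (note that GNU stat on Linux wants -c where BSD stat uses -f):

docker run --rm \
  --group-add "$(stat -c '%g' /var/run/docker.sock)" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  alpine:3 id

The socket's gid should show up in the groups= list of the output.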

dmaze commented 7 years ago

--group-add $(stat -c '%g' /var/run/docker.sock)

It doesn't actually help, because gosu resets the supplementary group list (https://github.com/tianon/gosu/blob/master/setup-user.go#L35). I can verify this by using docker exec ... sh to get a shell in the container and reading /proc/7/status (where 7 is the pid of the consul agent process).

/ # cat /proc/self/status
Name:   cat
...
Uid:    0   0   0   0
Gid:    0   0   0   0
FDSize: 64
Groups: 0 1 2 3 4 6 10 11 20 26 27 116 
/ # cat /proc/7/status
Name:   consul
...
Uid:    100 100 100 100
Gid:    1000    1000    1000    1000
FDSize: 64
Groups: 1000 
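
The same check, compressed into a single exec (a sketch that assumes the image's busybox provides pidof):

docker exec consul sh -c 'grep -E "^(Uid|Gid|Groups)" /proc/$(pidof consul)/status'
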
dweomer commented 7 years ago

Ah, shoot. I had forgotten about the privilege drop via gosu. I wonder if a patch to the entrypoint script could carry those groups forward?

dmaze commented 7 years ago

I feel like the easiest change would look like

if [ -S /var/run/docker.sock ]; then
  # gid that owns the bind-mounted socket on the host
  GID=$(stat -c %g /var/run/docker.sock)
  # create a matching group in the container if none exists yet
  if ! getent group "$GID" >/dev/null; then
    addgroup -g "$GID" docker
  fi
  # add consul to whichever group owns the socket
  adduser consul "$(getent group "$GID" | sed 's/:.*//')"
fi

immediately before the set -- gosu consul "$@". I don't really love that, though. It'd work, I think, since gosu rebuilds the target user's supplementary groups from /etc/group at drop time, provided the socket was bind-mounted as /var/run/docker.sock and was group-writeable.
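
For context, a sketch of how that would sit in the image's docker-entrypoint.sh; the surrounding lines are paraphrased rather than quoted from the actual entrypoint, so treat the exact shape as an assumption:

# ... config and data-dir handling above ...

if [ -S /var/run/docker.sock ]; then
  # proposed addition: mirror the socket's owning group onto the consul user
  GID=$(stat -c %g /var/run/docker.sock)
  getent group "$GID" >/dev/null || addgroup -g "$GID" docker
  adduser consul "$(getent group "$GID" | sed 's/:.*//')"
fi

# existing behavior: drop from root to the consul user, then exec
set -- gosu consul "$@"
exec "$@"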

VMitov commented 7 years ago

Any update on that?

tfhartmann commented 6 years ago

I've also run across this and was wondering about the status?

Vetal-ca commented 6 years ago

I have this problem too. For now I've alleviated it by allowing write access for others:

sudo chmod o+w /var/run/docker.sock

After that it worked.

What I've tried to set up in the consul/healthcheck container startup script:

# make consul a member of the docker group
if [ -S /var/run/docker.sock ]; then
  echo "Adding consul user to docker group"

  # group id of the docker socket
  gid=$(stat -c %g /var/run/docker.sock)

  docker_group=$(getent group "${gid}" | sed 's/:.*//')

  # if the socket's gid is already taken by some other group, remove that group
  if [ -n "${docker_group}" ] && [ 'docker' != "${docker_group}" ]
  then
    # http://blog.zot24.com/tips-tricks-with-alpine-docker/
    echo "Docker group id ${gid} is occupied by group ${docker_group}. Deleting it"
    delgroup "${docker_group}"
  fi

  # if no group in the container has the socket's gid, create one named "docker"
  if ! getent group "${gid}" >/dev/null; then
    addgroup -g "${gid}" docker
  fi

  docker_group=$(getent group "${gid}" | sed 's/:.*//')
  echo "Docker group: ${docker_group}"
  echo "Consul groups:"
  groups consul

  adduser consul "${docker_group}"

  echo "Consul groups after update:"
  groups consul
fi

This didn't help, with or without --group-add 999 on the consul/healthcheck container.

olivertappin commented 4 years ago

Any update on this one? We're experiencing the same issue, and running:

sudo chmod o+w /var/run/docker.sock

feels extremely dirty.
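
A somewhat narrower host-side stopgap, assuming the host supports POSIX ACLs and the agent really does run as uid 100 inside the container (as in the /proc output earlier in this thread), is an ACL scoped to that uid instead of o+w:

# grant only the container's uid read/write on the socket, not everyone
sudo setfacl -m u:100:rw /var/run/docker.sock

It is still host-side surgery, but it doesn't open the socket to every user on the machine.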

mkeeler commented 4 years ago

I think there are 3 options in total:

  1. As mentioned before, allow broader access to docker.sock. For security reasons this is not ideal.
  2. Allow the consul user in the container (uid 1000) access to the file. There are some odd interactions here between the host's view of users and the container's view.
  3. Set CONSUL_DISABLE_PERM_MGMT=true. That will prevent the entrypoint script from dropping privileges to the consul user; the process remains running as the user the container was started as. Ideally the container still wouldn't execute as root, so when running it you could docker run --user <some user with access to docker.sock> -e CONSUL_DISABLE_PERM_MGMT=true ... (a sketch follows below).

Personally, I would choose the third option. It does also mean that you will have to ensure that any volume mounted at the data directory is writable by that user and that the volume mounted at the config directory is readable by that user.
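
A sketch of option 3, assuming uid/gid 1000 for the chosen user, GNU stat on the host, and an image recent enough to honor CONSUL_DISABLE_PERM_MGMT; the mounts mirror the run command from the top of the thread:

# the chosen uid must own the data directory
sudo chown -R 1000:1000 $PWD/consul

docker run --name consul \
  --net host \
  --user 1000:1000 \
  --group-add "$(stat -c '%g' /var/run/docker.sock)" \
  -e CONSUL_DISABLE_PERM_MGMT=true \
  -v $PWD/consul:/consul/data:Z \
  -v $PWD/etc/consul:/consul/config:Z \
  -v /var/run/docker.sock:/var/run/docker.sock \
  consul agent

Because nothing execs through gosu in this mode, the supplementary group from --group-add survives into the agent process.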

eculver commented 3 years ago

Is there still interest in pushing this forward? If not, given that there isn't a clear consensus on a path forward and it's been over a year since any update, I'm considering just closing this out and revisiting when/if the time comes.

I'll let this simmer for a bit.