[WIP] Clean up images - Githubissues

toolmantim commented 8 years ago

This cuts everything down to three images that can more easily be built locally:

docker build -t buildkite/agent:stable stable
docker build -t buildkite/agent:unstable unstable
docker build -t buildkite/agent:experimental experimetal

Things that are gone:

Ubuntu
Docker: I suggest we move this to buildkite/agent-docker where we can go crazy with Docker version tags, etc
curl and wget

Things still needing to be done:

[ ] Test scripts
[ ] Publish scripts

toolmantim commented 8 years ago

@lox @blueimp @yoshuawuyts @rimusz do you have any thoughts on this?

Also this is kinda blocked on #37 as well, although that's going to have to be dealt with regardless.

blueimp commented 8 years ago

I like it, as it's simpler and provides a reproducible Dockerfile.

There's just one minor caveat with the following code:

ADD https://download.buildkite.com/agent/unstable/$BUILDKITE_AGENT_VERSION/buildkite-agent-linux-amd64 /usr/local/bin/buildkite-agent
RUN chmod +x /usr/local/bin/buildkite-agent

This will double the size of the buildkite-agent in the docker image, as it creates a second layer with a changed binary. If the binary's size is negligible thats' ok, else I'd maybe keep curl in and do the download and chmod in one RUN call.

toolmantim commented 8 years ago

@blueimp ah nice! I thought you might catch something like that.

blueimp commented 8 years ago

You're welcome. We're quite fond of buildkite and I'm happy to help. :)

Just last week I was working on our custom buildkite-agent and considered to base it on your official one. With the changes proposed in this PR, I'd definitely do it, as it makes it very easy to extend.

One more suggestion I have is to default the user of the buildkite-agent to an unprivileged one. Although there are now user namespaces in Docker, they are not enabled by default and might not be available on every platform.

We use the following Dockerfile instructions for this:

# Add buildkite user/group with uid/gid 1000:
RUN adduser -D -u 1000 buildkite

# Run the entrypoint scripts and start tini and the buildkite-agent as buildkite user:
ENTRYPOINT ["entrypoint", "su-exec", "buildkite", "/sbin/tini", "-g", "--", "buildkite-agent"]

su-exec is a tool like gosu.

entrypoint is a super simply script that allows to provide setup scripts in a directory (/usr/local/etc/entrypoint.d in this case):

#!/bin/sh
# shellcheck shell=dash

for file in /usr/local/etc/entrypoint.d/*; do
  [ -x "$file" ] && "$file";
done;

exec "$@"

One of the scripts in /usr/local/etc/entrypoint.d is the following, that allows the unprivileged user to run docker commands:

#!/bin/sh
# shellcheck shell=dash

set -e

# Retrieve the ID of the docker socket owner group:
DOCKER_GID=$(stat -c '%g' /var/run/docker.sock)

# Get the group name for this GID on this system:
DOCKER_GROUP=$(getent group "$DOCKER_GID" | cut -d: -f1)

# If the GID does not exist on this system, create a group:
if [ -z "$DOCKER_GROUP" ]; then
  DOCKER_GROUP=docker
  addgroup -g "$DOCKER_GID" "$DOCKER_GROUP"
fi

# Add the buildkite user to the docker socket owner group:
addgroup buildkite "$DOCKER_GROUP"

One might argue that docker users are alike to root, so there's no security benefit here. The main reason for us to do this though was permission problems in host mounts, as without user namespaces, the docker-based buildkite-agent checks out repositories as root.

toolmantim commented 8 years ago

Thanks. That was the intention of #18 and don't see why we couldn't do it in this new one.

Why all the Docker stuff though? Are you including Docker daemon in your own image?

blueimp commented 8 years ago

No, we're mounting the host docker socket into the buildkite container. The scripts are there to make sure the buildkite user is allowed to write to this docker socket.

btw. I would definitely use su-exec over sudo to drop privileged for reasons outlined on the gosu homepage.

yoshuawuyts commented 8 years ago

Cool to see an alpine based image. Might be worth adding the following perhaps:

https://github.com/lukasmartinelli/hadolint docker file linting, as an easy first step to catch errors automatically
https://github.com/jwilder/docker-squash, squash layers so that pruning can be done. Solves some of the issues mentioned earlier about layers stacking up

This will def mess with my apt-get install scripts I'm running, but I'll find a way around it. Rad! :sparkles:

blueimp commented 8 years ago

Hi @toolmantim, we're recently made our buildkite-agent repo (both on GitHub and Docker) public, so it's here if you wanna have a look at the stuff we're doing with it: https://github.com/allthings/buildkite-agent

sj26 commented 8 years ago

Nice idea to set a uid!

I'm a little worried about using UID 1000 for all docker users because we don't know how uids are allocated on the host system, and a typical pattern is to volume map in the builds directory etc, which might leave files belonging to an unintended user lying about on the host. I'm not sure this is worse than leaving root-owned files about the place, but it might mislead folks that the setup is secured.

Perhaps we can find another default. The nobody user is usually the last uid, 65534? But sometimes it can be 99? Or just choose something typically before the user-allocated uids, like 99 or 999?

lox commented 8 years ago

Will the lack of docker (client) and wget / curl drive people nuts?

blueimp commented 8 years ago

@sj26 The reason for setting the uid to 1000 is this boot2docker (docker machine) issue: https://github.com/boot2docker/boot2docker/issues/581 Basically it allows us to do avoid permission issues with host mounts for developers on OSX.

Since Docker has user namespaces now, using a different user than root is less of a security benefit. For us it's mainly a development benefit.

sj26 commented 8 years ago

@blueimp right, that's super annoying. We had to work around it specifically in the Elastic CI Stack. I don't think we can roll that into the main distribution though because not all users are running Docker for Mac / boot2docker / etc. so UID 1000 might not be as intended. Recommending user namespaces would be good, though!

lox commented 8 years ago

User namespaces get complicated quickly when the container creates multiple users, which happens in sufficiently complicated containers fairly often.

lox commented 7 years ago

I'm keen to see this happen, the current images are already quite behind docker.

lox commented 7 years ago

@toolmantim bump! :)

toolmantim commented 7 years ago

I'd like to get this done too, but there's a few things that still need to be figured out:

What do the images that include Docker look like? That still needs to be spiked out.
What's the migration plan for removing Docker from the images? Lots of people are using this image on Kubernetes and passing through the Docker socket. They'll need to update their Kubernetes setups, or maybe everything might break?

The full list of tags we have currently are here: https://hub.docker.com/r/buildkite/agent/tags/

lox commented 7 years ago

My take would be to include the Docker 1.13 client, which introduced interoperability with older server versions. I'd also include curl.

I'm starting to come around to @blueimp's /usr/local/etc/entrypoint.d stuff too. If you folks would like I could contribute to this?

blueimp commented 7 years ago

@lox Thanks for the shoutout :)! I think the entrypoint.d approach would be a nice way to avoid hard-coding the ssh-env-config.sh into the ENTRYPOINT declaration. Basically it would be similar to the buildkite hooks, in that there's a config folder of scripts analogue to the hooks that is executed on startup vs on builds.

As a side-note, we're probably moving away from the docker image, as by now we would definitely benefit from the auto-scaling buildkite/elastic-ci-stack-for-aws. The main benefit we get from the docker image based agent is packaging exactly the software and config tools we need. But then, everything that's not included as binary in the elastic-ci-stack image could also be packaged up as one-purpose docker images.

P.S.: I just realised you're not actually working for BuildKite. Due to all your contributions I thought you're already part of their company. :)

lox commented 7 years ago

@blueimp my renewed interest in the docker image is that I'd really like buildkite/elastic-ci-stack-for-aws to use docker images for the agents. Ideally this lets us iterate on the agent versions / images quicker than the aws stack images.

How would you feel about that?

blueimp commented 7 years ago

That would combine the advantages of both, so I'm all for it. :)

toolmantim commented 7 years ago

Superseded by #44

buildkite / docker-buildkite-agent

[WIP] Clean up images #38