buildkite-plugins / docker-buildkite-plugin

🐳📦 Run any build step in a Docker container
MIT License

Cancelled jobs leave containers running #127

Closed by lox 5 years ago

lox commented 5 years ago

We've had reports that long-running scripts run via the docker plugin leave their container running when the job is cancelled.

I verified this with this gist: https://gist.github.com/lox/045a4b56a0c1e1c815fd011657c34b46/708574c0b0ff9ec8748d2cd736de31951da555cb

The output looks like:

[image: screenshot of the output]

For some context, the agent initiates cancellation and sends a SIGTERM to the process group of buildkite-agent bootstrap that is executing the job.
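
As a rough illustration of what signalling a process group means here, the sketch below starts a stand-in parent with a child and sends SIGTERM to the whole group. The `sh -c 'sleep 300 & wait'` command is only a hypothetical placeholder for the bootstrap process; it is not how the agent is implemented.

```shell
#!/usr/bin/env bash
# Sketch only: a stand-in for the bootstrap, not the agent's real code.
set -m   # enable job control so each background job gets its own process group

# Hypothetical stand-in: a parent shell with one long-running child.
sh -c 'sleep 300 & wait' &
pgid=$!   # with job control enabled, the job's PID is also its process group ID

sleep 1   # give the stand-in a moment to start its child

# A negative PID targets every process in the group, parent and child alike.
kill -TERM -- "-$pgid"

wait "$pgid"
status=$?
echo "exit status: $status"   # 128 + 15 (SIGTERM) when the parent died from the signal
```

Every process in the group receives the signal, so whether the container stops depends on what `docker run` itself does on SIGTERM.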

It appears that docker run, when sent a SIGTERM, terminates without stopping the container. My previous understanding was that docker run would proxy signals through to the container. There were some caveats around how pid 1 handles signals, but I expected it to be OK with the --init flag or a tini entrypoint. That doesn't seem to be accurate.

lox commented 5 years ago

https://github.com/moby/moby/issues/9098#issuecomment-347536699:

The root cause is when using --tty signal proxying is entirely disabled with no way to enable it (even with --sig-proxy).

🤦🏼‍♂️

lox commented 5 years ago

The breakage was introduced in 2013, and there is an open patch for the Docker CLI at https://github.com/docker/cli/pull/1841.

lox commented 5 years ago

I'll fix this with a pre-exit hook that calls:

docker kill --signal=SIGTERM my_container
docker rm -v my_container
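
A minimal sketch of how such a hook might wrap those two commands, assuming the container name is known to the hook. The `cleanup_container` function and the `CONTAINER_NAME` variable are hypothetical names chosen for illustration, not part of the plugin.

```shell
#!/usr/bin/env bash
# Sketch of a pre-exit hook. cleanup_container and CONTAINER_NAME are
# hypothetical names, not the plugin's actual interface.

# Signal and remove a container by name, tolerating one that is already gone.
cleanup_container() {
  local name="$1"
  # Send SIGTERM to the container directly, since docker run did not forward it.
  docker kill --signal=SIGTERM "$name" >/dev/null 2>&1 || true
  # Remove the container along with its anonymous volumes.
  docker rm -v "$name" >/dev/null 2>&1 || true
}

# Only act when a container name was provided (assumed to be set elsewhere).
if [ -n "${CONTAINER_NAME:-}" ]; then
  cleanup_container "$CONTAINER_NAME"
fi
```

Failures are swallowed so that a missing container does not fail the hook. One caveat: `docker rm` refuses to remove a container that is still running, so if the process needs time to exit after SIGTERM, a wait or `docker rm --force` may be needed.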