buildkite-plugins / docker-compose-buildkite-plugin

🐳⚡️ Run build scripts, and build + push images, w/ Docker Compose
MIT License
171 stars 138 forks source link

Make `graceful-shutdown: true` the default behavior #411

Closed jmctune closed 9 months ago

jmctune commented 9 months ago

Hey there,

We just noticed an issue in our CI environment where we had lots of running containers on our CI hosts. This was due to docker-compose standing up containers, but not tearing them down. We just recently discovered the graceful-shutdown flag and noticed it was set to false. Is there a reason why false is the default behavior? Maybe I am thinking more for our use case, but it would make sense to me that when CI has completed its tests (or whatever work), that the containers are no longer necessary and should always be destroyed to clean up any cruft.

Thanks for the consideration.

toote commented 9 months ago

Hi @jmctune, please note that graceful-shutdown only changes how the remaining containers will be cleaned up not if the will be. The default is to use docker compose kill, when the option is turned on it would use docker compose stop instead.

The only reason for it to be false by default is to avoid breaking previous behaviour. As you can see in the PR that added the option, before that it would only use a kill.

If your containers were not being torn down, there must have been another issue that were leaving them around as the cleanup code is executed in the pre-exit hook and should have either stopped or killed.

jmctune commented 9 months ago

Hi @toote, thanks for getting back to me.

I was confused on the graceful-shutdown logic -- sorry about that.

My issue still remains though; we are experiencing an issue with this plugin where containers are not being torn down (they're remaining "Running"), which is causing us some headaches with OOM issues on our CI hosts.

As a workaround, we're implementing logic to delete the running containers ourselves in a pre-exit hook, but do you have any immediate areas to search for on why the plugin wouldn't be performing a docker compose kill?

I am looking through our build logs and I do not see this message submitted:

https://github.com/buildkite-plugins/docker-compose-buildkite-plugin/blob/master/hooks/pre-exit#L17

jmctune commented 9 months ago

Wow, a lot of self-inflicted harm here! This was an issue strictly with some customizations on our end where we just started deleting both the cache directory and agent directory folders to clean up our storage. This was done in a pre-exit hook on our side, but the vendored plugin hooks run after ours, which means the docker-compose pre-exit hook wasn't even aware it needed to run.

This is documented and was quite a learning experience. Sorry for the noise.