kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

RFC: Deployment hooks #14512

Open ironcladlou opened 9 years ago

ironcladlou commented 9 years ago

Users often need a means to inject custom behavior into the lifecycle of a deployment process. The deployment API (#1743) could be expanded to support the execution of user-specified Docker images, which are given an opportunity to run to completion at various points during the reconciliation process for a deployment.

Use cases and various design approaches were discussed previously in an OpenShift deployment hooks proposal.

This RFC is to capture initial thoughts on the topic and to link any existing related issues.

ironcladlou commented 9 years ago

cc @smarterclayton @nikhiljindal @ghodss @bgrant0607

pmorie commented 9 years ago

Hooks be like

:arrow_right_hook: :leftwards_arrow_with_hook:

smarterclayton commented 9 years ago

A hook is a materialization of the "process" a deployment requires. A hook is a way of reducing the cost of implementing a fully custom deployment process - instead, at logical points in the flow, control of the process is handed off to user code. Practically, hooks often involve one-way transitions in state, such as a forward database migration, removal of old values from a persistent store, or the clearing of a state cache. A hook is effectively a synchronous event listener with veto power - it may return success or failure, or in some cases need re-execution.

Because deployment processes typically involve coupling between state (assumed elsewhere) and code (frequently the target of the deployment), it should be easy for a hook to be coupled to a particular version of code, and easy for a deployer to use the code under deployment from the hook.

Not all deployment processes are equal - most code deployments are small, with infrequent larger deployments that require schema or state changes. It should be easy to reason about when hooks will get run, as well as to temporarily disable them. Important deployments are usually manual, so there is less motivation to build hook enable/disable mechanisms - it is usually better to allow automatic deployment and hooks to be disabled and an imperative series of actions taken instead.

bgrant0607 commented 9 years ago

Related: #3585

gravis commented 9 years ago

Let me sum up our recent experience with OpenShift here. We're using a dc to deploy an nginx + rails pod (2 containers). The 2 containers share an EmptyDir to hold assets generated by sprockets in Rails (http://guides.rubyonrails.org/asset_pipeline.html#precompiling-assets). Rails was previously deployed using capistrano, and a recipe was in charge of pre-compiling these assets. This is a good example of where a pre hook would fit. In dev, we don't precompile assets, so the container command is just "unicorn [...]" (a Ruby webserver). In production, we need to exec a pre-hook to compile these assets before unicorn starts. This is where we ran into this issue: https://github.com/openshift/origin/issues/4711#issuecomment-158090169

bgrant0607 commented 9 years ago

Example of a pre-rollout hook: https://github.com/kubernetes/kubernetes/issues/1899#issuecomment-158634432

smarterclayton commented 9 years ago

This could also be done with the future init container proposal, where the init container on each pod gets a chance to run arbitrary code to fill out volumes.
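As a rough sketch of that approach (all names, images, and paths here are illustrative, not from the thread), an init container could populate the shared EmptyDir before the app containers start:

```yaml
# Hypothetical sketch: an init container precompiles assets into a shared
# emptyDir volume before the main containers start.
apiVersion: v1
kind: Pod
metadata:
  name: rails-app
spec:
  initContainers:
  - name: precompile-assets
    image: example/rails-app:v2          # same image as the app container
    command: ["bundle", "exec", "rake", "assets:precompile"]
    volumeMounts:
    - name: assets
      mountPath: /app/public/assets
  containers:
  - name: unicorn
    image: example/rails-app:v2
    command: ["bundle", "exec", "unicorn"]
    volumeMounts:
    - name: assets
      mountPath: /app/public/assets
  - name: nginx
    image: nginx:1.25
    volumeMounts:
    - name: assets
      mountPath: /usr/share/nginx/html/assets
  volumes:
  - name: assets
    emptyDir: {}
```

Note this runs per pod, not once per deployment, which is exactly the limitation discussed later in this thread.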


bgrant0607 commented 9 years ago

A hook use case is described here: https://groups.google.com/forum/#!topic/google-containers/zblnAzLeSJA

edevil commented 8 years ago

I'm also having difficulty deciding when/how to do database migrations in the context of a rolling-upgrade. It seems these hooks would be the perfect place for running the migration script.

wombat commented 8 years ago

@edevil well, as you can't have breaking schema changes in a rolling-upgrade scenario, you can also use K8s Jobs to do this. You can either wait for the Job to complete before doing the upgrade or trigger them at the same time.
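The "wait for the Job first" variant could be sketched like this (the Job manifest, Job name, deployment, and image are placeholders, not anything defined in this thread):

```shell
# Hypothetical sketch: run the migration as a Job, block until it completes,
# then roll out the new image.
kubectl apply -f migrate-job.yaml
kubectl wait --for=condition=complete --timeout=300s job/db-migrate
kubectl set image deployment/web app=example/web:v2
```

This only works safely when, as noted above, the schema change is backward-compatible with the old pods that are still running during the rollout.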

0xmichalis commented 8 years ago

I'm also having difficulty deciding when/how to do database migrations in the context of a rolling-upgrade. It seems these hooks would be the perfect place for running the migration script.

This is the place where a mid-hook with the Recreate deployment strategy makes sense

cc: @kubernetes/deployment

smarterclayton commented 8 years ago

Yes, that's what mid hooks are primarily geared for.

An open question is how hooks might interact with petsets as well. As part of PetSet we've been discussing how membership changes are signaled to the cluster, but in an update scenario there will need to be an orchestrated series of steps (some of which require the processes to be stopped). So I would argue that the general use case for both deployments and petsets must involve some sort of "associated change", which might be hooks, or a babysitter process, or a custom controller. It's better if those hooks are re-entrant and frequently applied, since that makes their behavior fit a control loop style rather than a workflow style.

F21 commented 8 years ago

Any updates on this one? I am interested in being able to run a deployment and start a chain of init containers for the deployment (not on a per-pod basis).

edevil commented 8 years ago

Yeah, init containers per-deployment would be a good option as well.

0xmichalis commented 8 years ago

It's better if those hooks are re-entrant and frequently applied (since that makes their behavior fit a control loop style, rather than a workflow style)

Agreed. FWIW, the way we ensure rollbacks are re-entrant for Deployments is an API field that, when populated, causes the controller to remove it and update the pod template to the specified version in one atomic call.

dhilipkumars commented 7 years ago

The proposed deferContainers would give better support for cleanup actions like db-migration, etc.

0xmichalis commented 7 years ago

The proposed deferContainers would give better support for cleanup actions like db-migration, etc.

@dhilipkumars deferContainers are executed at the pod level and are equivalent to post-hooks. This issue is about hooks at the deployment level, and it includes other types of hooks too (pre, mid).

dhilipkumars commented 7 years ago

@kargakis deferContainers are proposed to be PreStop, so db-migration can be programmed easily using them. If the reason for these hooks is "better support for stateful apps", shouldn't we think of doing this in StatefulSets instead?

0xmichalis commented 7 years ago

There are more reasons for hooks than db migrations, e.g. image promotion. That said, if we get auto-pausing in the workload controllers (https://github.com/kubernetes/kubernetes/issues/11505), we can satisfy the use cases of this issue by programming hooks on top of the existing APIs.

montanaflynn commented 7 years ago

Not sure if this is helpful or even related but here's our use case:

We want to get an http request webhook anytime a deployment changes, this will allow us to correlate errors and changes to metrics with deployments.

gkop commented 7 years ago

We're using a very vanilla setup on GKE with the default rolling deployments. It's surprising that there's apparently not a simple and conventional way to get a hook when the rolling deploy completes. Would anybody be so kind to share a workaround for this? <3

2rs2ts commented 7 years ago

We have a very similar use case as @montanaflynn and deferContainers would not solve it.

gkop commented 7 years ago

Our workaround is to spawn `kubectl rollout status deploy/$DEPLOYMENT_NAME --watch=true | tail -n 1` and wait for the process to exit. We use the process exit status to determine whether or not the deploy was successful, and include the last line of output in our deploy notifications (usually "deployment $DEPLOYMENT_NAME successfully rolled out"). #hacktastic
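A sketch of that workaround as a script (the notification webhook and both environment variables are placeholders; `set -o pipefail` is needed so the pipeline reflects kubectl's exit status rather than tail's):

```shell
#!/bin/bash
set -o pipefail   # propagate kubectl's exit status through the pipe to tail

# Hypothetical sketch: block until the rollout finishes, then use the exit
# status to drive a deploy notification. DEPLOYMENT_NAME and
# NOTIFY_WEBHOOK_URL are placeholders.
if STATUS=$(kubectl rollout status "deploy/$DEPLOYMENT_NAME" --watch=true | tail -n 1); then
  curl -s -X POST -H 'Content-Type: application/json' \
       -d "{\"text\": \"$STATUS\"}" "$NOTIFY_WEBHOOK_URL"
else
  echo "deploy failed: $STATUS" >&2
  exit 1
fi
```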

486 commented 7 years ago

Hi,

Any updates on this? We are using OpenShift but also want to be compatible with vanilla Kubernetes. Losing deployment hooks is the single biggest headache in that transition.

It is just so common to have older software that expects migrations or other setup code to be executed by exactly one process, which can be easily done in the pre-deployment hook.

From a user perspective, I find it very helpful to have deployment hooks as a first-class feature. "Your deployment failed because your migrations, which happen in your pre-hook, didn't run through, you should check that."

As a workaround, Helm can orchestrate Jobs that have to run before your Deployment, but we don't want to adopt more moving parts for this single feature.
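For reference, that Helm workaround amounts to a Job annotated as a pre-install/pre-upgrade hook, which Helm runs to completion before applying the rest of the chart. A minimal sketch (name, image, and command are illustrative):

```yaml
# Hypothetical sketch: a Helm hook Job that runs migrations before the
# chart's Deployment is installed or upgraded.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0            # fail the release rather than retry a bad migration
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: example/app:v2
        command: ["./manage.py", "migrate"]
```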

edevil commented 7 years ago

Just to add more fuel to the discussion, I was recently bitten by a problem using initContainers for running Django migrations.

During one of the rolling upgrades, several pods ran the "./manage.py migrate" script at the same time through the "initContainers" feature. Since Django does not do every migration operation inside a transaction, some of the operations were interleaved, which did not end well.

As I really need to run this script once per rolling-deployment (recreate is not an option), I am left with no other option but to do it manually... Launch a Job, check on it periodically, make sure it has run successfully, and only after perform the deployment.

ironcladlou commented 7 years ago

@486 @edevil the discussion continues over here: https://github.com/kubernetes/community/pull/1171

bgrant0607 commented 7 years ago

/lifecycle frozen

shenshouer commented 5 years ago

Any update on this?

galindro commented 5 years ago

@ironcladlou do you know where in https://github.com/kubernetes/enhancements the proposal https://github.com/kubernetes/community/pull/1171 was moved to?

alper commented 7 months ago

Is this still open or has this been completed somewhere?

sftim commented 1 month ago

The deployment API (https://github.com/kubernetes/kubernetes/issues/1743) could be expanded to support the execution of user-specified Docker images, which are given an opportunity to run to completion at various points during the reconciliation process for a deployment.

How about CEL as an alternative? Maybe MutatingAdmissionPolicy lets us build this now?