buildkite / lifecycled

A daemon for responding to AWS AutoScaling Lifecycle Hooks
MIT License

Feature request: ECS graceful shutdown/instance draining #54

Open itsdalmo opened 5 years ago

itsdalmo commented 5 years ago

When running ECS clusters on AWS, a zero-downtime rolling update of the underlying VMs can only be done gracefully with lifecycle hooks: call UpdateContainerInstancesState to set the instance state to DRAINING, then wait until the instance has zero running tasks before completing the lifecycle action.
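To make the sequence concrete, here is a minimal sketch of the drain-then-complete flow. It is written in Python for brevity (lifecycled itself is a Go project) and is not taken from any existing implementation. It assumes `ecs` and `autoscaling` behave like boto3 clients, and that the container instance ARN, hook name, and group name are already known. The function names, polling interval, and timeout are made up for illustration.

```python
import time


def drain_and_wait(ecs, cluster, container_instance_arn,
                   poll_seconds=10.0, timeout_seconds=900.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Set the instance to DRAINING, then poll until it has no running tasks.

    `ecs` is assumed to behave like a boto3 ECS client. Returns True once
    the instance reports zero running tasks, False if the timeout passes first.
    """
    ecs.update_container_instances_state(
        cluster=cluster,
        containerInstances=[container_instance_arn],
        status="DRAINING",
    )
    deadline = clock() + timeout_seconds
    while True:
        resp = ecs.describe_container_instances(
            cluster=cluster,
            containerInstances=[container_instance_arn],
        )
        if resp["containerInstances"][0]["runningTasksCount"] == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


def complete_hook(autoscaling, hook_name, group_name, instance_id,
                  result="CONTINUE"):
    """Tell the Auto Scaling group that the lifecycle action is finished.

    `autoscaling` is assumed to behave like a boto3 Auto Scaling client.
    """
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=group_name,
        InstanceId=instance_id,
        LifecycleActionResult=result,
    )
```

A caller would run `drain_and_wait(...)` first and call `complete_hook(...)` only after it returns. What to do on a timeout (complete anyway, or abandon) is a policy choice this sketch leaves open.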

This pattern is shown here using a Lambda:

I'm wondering whether this could be handled by lifecycled instead. As always, doing it this way rather than with a Lambda has some pros and cons for users.

Pros

Cons

If you think this belongs in lifecycled (and it seems like good practice), we could add a new flag, --ecs-cluster, and implement a new handler (ECSHandler?) that drains the instance before the lifecycle hook is completed. We could probably hardcode it to run before the FileHandler (aka the handler script).
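The proposed ordering can be sketched as follows. This is a hypothetical illustration in Python, not lifecycled's actual code or CLI. `--ecs-cluster` is the flag proposed above; the `--handler` option for the script path, and the names `build_handlers`, "ecs-drain", and "file", are assumptions made for the example.

```python
import argparse


def build_handlers(argv):
    """Return (kind, target) pairs in the order the handlers would run."""
    parser = argparse.ArgumentParser()
    # Flag proposed in this issue; absent means no ECS draining.
    parser.add_argument("--ecs-cluster", default=None)
    # Assumed option pointing at the user's handler script.
    parser.add_argument("--handler", default=None)
    args = parser.parse_args(argv)

    handlers = []
    if args.ecs_cluster:
        # Draining is hardcoded to come first, so the script only runs
        # once the instance has no tasks left.
        handlers.append(("ecs-drain", args.ecs_cluster))
    if args.handler:
        handlers.append(("file", args.handler))
    return handlers
```

With both options set, the drain step is always placed ahead of the script. Without `--ecs-cluster`, the list contains only the script handler, so existing users would see no change.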

What do you think @lox?

lox commented 5 years ago

Yup, I love that idea!

lox commented 5 years ago

The other thing that would be neat is hibernation support: https://github.com/aws/ec2-hibernate-linux-agent