EC2 failures and graceful shutdown can cause prolonged errors

If an EC2 failure occurs and the node is then terminated by the ASG or by a person, lifecycle-manager receives the lifecycle hook and starts the drain/deregister flow. In this case the drain keeps failing for as long as --drain-timeout. This keeps the instance alive in the meantime, and applications can see errors because the instance is still registered in its target groups.

We should evaluate whether to attempt deregistration only, or to skip the flow altogether, when the node state is unknown.
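
To make the two options concrete, here is a rough sketch of the decision in Go. It is not taken from the lifecycle-manager code: `chooseAction`, the `Action` values, and the `skipOnUnknown` switch are all hypothetical names used for illustration. The only thing borrowed from Kubernetes is that a node's Ready condition can be `True`, `False`, or `Unknown`. The sketch treats only `Unknown` as the "node state is unknown" case; whether `False` should be handled the same way is an open question.

```go
package main

import "fmt"

// NodeReady mirrors the three possible values of a Kubernetes node's
// Ready condition status.
type NodeReady string

const (
	ReadyTrue    NodeReady = "True"
	ReadyFalse   NodeReady = "False"
	ReadyUnknown NodeReady = "Unknown"
)

// Action is a hypothetical label for what the handler does with a
// terminating instance. These names are assumptions for illustration.
type Action string

const (
	DrainAndDeregister Action = "drain-and-deregister"
	DeregisterOnly     Action = "deregister-only"
	Skip               Action = "skip"
)

// chooseAction picks a flow based on the node's Ready condition.
// skipOnUnknown is a hypothetical switch between the two options
// raised in this issue: deregister only, or skip the flow entirely.
func chooseAction(ready NodeReady, skipOnUnknown bool) Action {
	if ready != ReadyUnknown {
		// Node state is known: keep the existing full flow.
		return DrainAndDeregister
	}
	if skipOnUnknown {
		// Do nothing and let the instance terminate.
		return Skip
	}
	// Do not wait on a drain that is expected to fail; only remove
	// the instance from its target groups.
	return DeregisterOnly
}

func main() {
	fmt.Println(chooseAction(ReadyUnknown, false))
	fmt.Println(chooseAction(ReadyUnknown, true))
	fmt.Println(chooseAction(ReadyTrue, false))
}
```

With deregister-only, the instance would still be removed from its target groups, but the handler would not wait out --drain-timeout on a drain that cannot succeed.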

keikoproj / lifecycle-manager
