buildkite / lifecycled

A daemon for responding to AWS AutoScaling Lifecycle Hooks
MIT License
146 stars 34 forks source link

Add a "happy path" for shutting down the daemon... #42

Closed itsdalmo closed 6 years ago

itsdalmo commented 6 years ago

Ref my comment here: https://github.com/buildkite/lifecycled/pull/40#issuecomment-418678672

I'd like to flesh out how I'd like to fix this, since I suspect it'd involve some refactoring that you may/may not be on board with and would possibly make the PR hard to review as a whole. So it's probably a good idea to flesh out the overall "plan" before I start work:

  1. I'd like daemon.Start() to return a completeFunc which is what actually sends the CompleteLifecycleAction request and can be invoked from main.go. The reason for this is that we'd like the queue to be cleaned up before the interrupt signal is received (giving us more time for retries etc).
  2. daemon.Start() will need to be rewritten to support a "happy path" which does not require a signal in order to shut down. While I'm at it/since the two go-routines in daemon.Start() (i.e. go func()...) essentially do the same thing, they invoke the handler if a termination notice is received. With some refactoring I believe this could be simplified a bit and hopefully become easier to reason about "the happy path":
// Note that the polling functions will close the channel when the context is cancelled.
terminationNotice := make(chan *sqs.Message)
go d.pollTerminationNotice(ctx, terminationNotice)

spotTermination := make(chan time.Time)
go d. pollSpotTermination(ctx, spotTermination)

var (
  // In reality we'd declare this variable in the function signature?
  completeFunc          func() error
  terminationTransition string
)

Polling:
  for {
    select {
    case ctx.Done():
      return completeFunc, nil
    case n := <- terminationNotice:
      log.Infof("Got a lifecycle termination notice: %v", notice)
      terminationTransition = aws.StringValue(m.Transition)
      hbt := time.NewTicker(heartbeatFrequency)
      go func() {
        for range hbt.C {
          ctx.Debug("Sending heartbeat")
          if err := sendHeartbeat(d.AutoScaling, m); err != nil {
        ctx.WithError(err).Error("Heartbeat failed")
          }
        }
      }
      // completeLifecycle takes care of stopping the hbt when the hook has been completed.
      completeFunc = d.completeLifecycle(d.AutoScaling, hbt, m)
      break Polling
    }
    case n := <- spotTermination:
      log.Infof("Got a spot instance termination notice: %v", notice)
      terminationTransition = "ec2:SPOT_INSTANCE_TERMINATION"
      break Polling
    }
  }

err := executeHandler(d.Handler, []string{terminationTransition, d.InstanceID}, d.Signals)
if err != nil {
  log.WithError(err).Error("Failed to run handler script")
}

return completeFunc, nil
// main.go would have to check whether the completeFunc is nil before invoking it.

(PS: Wrote the code above in the issue so take it as psuedo-code 😎)

  1. Aside from the obvious, the above changes would mean the following refactoring:
    • pollSpotTermination is made into a private method of Daemon.
    • handleMessage, queue.Receive and this go-routine are combined into one function: pollTerminationNotice (also a private method of Daemon). The new function just becomes a loop which polls for new messages, unmarshals and filters the messages, and only passes messages that should be acted on over the channel.
    • Optional: We could also introduce a new type so we can use the same channel for both spotTermination and terminationNotice, and then use a simpler loop inside daemon.Start.

If you'd consider accepting the changes above, I'll be happy to submit a PR that we can discuss further!

itsdalmo commented 6 years ago

@lox: I've started some work here: https://github.com/itsdalmo/lifecycled/commits/add-happy-path

So far I've added 3 quick commits:

These commits lay the groundwork for the "happy path" discussed above. Note that they are very rough and will need some polish before I can submit a PR, should you want it. Let me know!

lox commented 6 years ago

Awesome, will check them out, absolutely submit a PR!

itsdalmo commented 6 years ago

Closed by #45.