mapbox / cloudfriend

Helper functions for assembling CloudFormation templates in JavaScript
ISC License

Force lambda alarm evaluation window to be >= lambda timeout #70

Closed Ramshackle-Jamathon closed 5 years ago

Ramshackle-Jamathon commented 5 years ago

This PR adds logic to the lambda shortcut that checks whether the CloudWatch alarm configuration produces an evaluation window that matches or exceeds the configured Lambda timeout. If it does not, an error is thrown. If no alarm configuration is provided, cloudfriend generates a valid "M of N" alarm.
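A minimal sketch of the kind of check described above, not the code in this PR. The function names `assertWindowCoversTimeout` and `defaultAlarmWindow` are hypothetical, and the default of "1 of N" with 60-second periods is an assumption made for illustration. `Period`, `EvaluationPeriods` and `DatapointsToAlarm` correspond to the properties of the same names on `AWS::CloudWatch::Alarm`.

```javascript
// Hypothetical helper, not cloudfriend's actual implementation.
// The evaluation window is Period * EvaluationPeriods (seconds); it must be
// at least as long as the Lambda timeout (also seconds).
function assertWindowCoversTimeout({ timeout, period, evaluationPeriods }) {
  const windowSeconds = period * evaluationPeriods;
  if (windowSeconds < timeout) {
    throw new Error(
      `Alarm evaluation window (${windowSeconds}s) is shorter than the Lambda timeout (${timeout}s)`
    );
  }
  return windowSeconds;
}

// Hypothetical default when the caller supplies no alarm configuration:
// use the smallest N whose combined periods cover the timeout, and alarm
// when any 1 of those N datapoints breaches (an assumed choice of M).
function defaultAlarmWindow(timeout, period = 60) {
  const evaluationPeriods = Math.ceil(timeout / period);
  return { period, evaluationPeriods, datapointsToAlarm: 1 };
}

// Usage: a 15-minute timeout needs fifteen 60-second periods.
const alarm = defaultAlarmWindow(900);
assertWindowCoversTimeout({ timeout: 900, ...alarm });
```

Under these assumptions, a configuration such as five 60-second periods paired with a 900-second timeout would be rejected with an error instead of silently producing an alarm that can miss failures.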

These changes cover a failure mode involving timeouts or late runtime errors in Lambda functions with a large configured timeout (>5 minutes). Lambda posts failure datapoints to CloudWatch after the function completes, but the timestamp on those datapoints is the time of the invoke event. Put differently, Lambda inserts CloudWatch failure datapoints Y minutes in the past, where Y is the runtime of the function.
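The timing problem can be illustrated with a toy model. This is a simplification under assumed numbers, not a description of how CloudWatch evaluates alarms internally; the function `failureVisibleToAlarm` is hypothetical.

```javascript
// Toy model: a failure datapoint is stamped with the invoke time but only
// reaches CloudWatch when the function finishes. Is that timestamp still
// inside the alarm's evaluation window at the moment it is published?
function failureVisibleToAlarm({ runtimeSeconds, windowSeconds }) {
  const invokedAt = 0;                              // timestamp on the datapoint
  const publishedAt = invokedAt + runtimeSeconds;   // when the datapoint arrives
  const windowStart = publishedAt - windowSeconds;  // oldest time still evaluated
  return invokedAt >= windowStart;
}

// A 10-minute run against a 5-minute window: the datapoint lands 5 minutes
// before the window starts.
const missed = failureVisibleToAlarm({ runtimeSeconds: 600, windowSeconds: 300 });

// The same run against a 10-minute window: the datapoint is still covered.
const caught = failureVisibleToAlarm({ runtimeSeconds: 600, windowSeconds: 600 });
```

In this model `missed` is `false` and `caught` is `true`, which is the reason for requiring a window at least as long as the timeout.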

According to AWS support, CloudWatch alarms take some undocumented steps when there are missing datapoints or backfilled events, which can sometimes avoid this failure, especially when the alarm evaluation window and the Lambda runtime differ by only a few minutes. To be sure that the alarm captures every Lambda failure, however, the evaluation window should be expanded to cover the full runtime of the function.