This PR adds logic to the Lambda shortcut that checks whether the CloudWatch alarm configuration produces an evaluation window at least as long as the configured Lambda timeout. If it does not, an error is thrown. If no alarm configuration is provided, cloudfriend generates a valid "M of N" alarm.
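The check described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not cloudfriend's actual code: the function name `check_alarm_window` and its parameters are hypothetical, and it assumes the evaluation window is simply the alarm period multiplied by the number of evaluation periods.

```python
def check_alarm_window(timeout_seconds, period_seconds, evaluation_periods):
    """Hypothetical helper: reject alarm settings whose window is shorter than the timeout.

    Assumes the evaluation window equals Period * EvaluationPeriods, in seconds.
    """
    window_seconds = period_seconds * evaluation_periods
    if window_seconds < timeout_seconds:
        raise ValueError(
            f"Alarm evaluation window ({window_seconds}s) is shorter than "
            f"the Lambda timeout ({timeout_seconds}s)"
        )
    return window_seconds
```

Under these assumptions, a 900-second timeout passes with a 60-second period and 15 evaluation periods, and fails with only 5 evaluation periods.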
These changes cover a failure mode involving timeouts or late runtime errors in Lambdas with a large configured timeout (>5 minutes). Lambda posts failure datapoints to CloudWatch after the invocation completes, but stamps them with the time of the invoke event. Put differently, Lambda inserts CloudWatch failure datapoints Y minutes in the past, where Y is the runtime of the Lambda. If Y is longer than the alarm's evaluation window, the datapoint can land outside the window and the alarm may not fire.
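A simplified model of the backdating problem, for illustration only. It assumes the alarm looks back exactly one evaluation window from the moment the datapoint is posted, and it ignores period alignment and any undocumented CloudWatch handling of late data. The function name `datapoint_in_window` is hypothetical.

```python
from datetime import datetime, timedelta

def datapoint_in_window(invoke_time, runtime, window):
    """Return True if a failure datapoint stamped at invoke_time is still
    inside the evaluation window at the moment it is posted.

    Simplified model: the datapoint is posted at invoke_time + runtime, and
    the alarm looks back `window` from that posting time.
    """
    posted_at = invoke_time + runtime
    window_start = posted_at - window
    return invoke_time >= window_start

invoke = datetime(2024, 1, 1, 12, 0, 0)
# A 15-minute run against a 5-minute window: the datapoint is backdated out of view.
missed = datapoint_in_window(invoke, timedelta(minutes=15), timedelta(minutes=5))
# A 3-minute run against the same window: the datapoint is still visible.
seen = datapoint_in_window(invoke, timedelta(minutes=3), timedelta(minutes=5))
```

In this model the datapoint is visible exactly when the runtime does not exceed the window, which is the condition the new check enforces against the configured timeout.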
According to AWS Support, CloudWatch alarms take some undocumented steps when there are missing datapoints or backfilled events, which can sometimes avoid this failure, especially when the alarm evaluation window and the Lambda runtime differ by only a few minutes. To be sure that the alarm captures all Lambda failures, however, the evaluation window should be expanded to cover the full runtime of the Lambda.
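One way to size a window that covers the full runtime is sketched below. This is not necessarily how cloudfriend builds its default alarm: the function name `default_m_of_n`, the 60-second default period, and the choice of M = 1 are assumptions made for illustration. The keys mirror the CloudFormation alarm properties `Period`, `EvaluationPeriods`, and `DatapointsToAlarm`.

```python
import math

def default_m_of_n(timeout_seconds, period_seconds=60, datapoints_to_alarm=1):
    """Hypothetical helper: pick the smallest N so that Period * N covers the timeout.

    M (datapoints_to_alarm) defaults to 1 here as an assumption, so a single
    failing period anywhere in the window triggers the alarm.
    """
    evaluation_periods = max(1, math.ceil(timeout_seconds / period_seconds))
    return {
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "DatapointsToAlarm": datapoints_to_alarm,
    }
```

For a 900-second timeout this yields a "1 of 15" alarm over one-minute periods. A timeout that is not a multiple of the period rounds up, so 310 seconds yields 6 evaluation periods.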