document CloudWatch alarms #607

Open Nemo64 opened 4 years ago

Nemo64 commented 4 years ago

I'd like documentation about CloudWatch alarms.

I'm not fluent in that topic yet, but I'll have to experiment with it in the future, so I can add something in the coming weeks.

I mean something like this CloudFormation template:

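    resources:
        Resources:
            EmailTopic:
                Type: AWS::SNS::Topic
                Properties:
                    Subscription:
                        -   Endpoint:   # email address not shown in the original post
                            Protocol: "email"
            Http4xxAlarm:   # logical ID not shown in the original post; placeholder name
                Type: AWS::CloudWatch::Alarm
                Properties:
                    ActionsEnabled: true
                    AlarmActions:
                        - !Ref EmailTopic
                    AlarmDescription: HTTP 4xx
                    AlarmName: HTTP 4xx
                    ComparisonOperator: GreaterThanThreshold
                    DatapointsToAlarm: 1
                    EvaluationPeriods: 1
                    Dimensions:
                        - Name: ApiName
                          Value: ${self:provider.stage}-${self:service}
                    MetricName: 4XXError
                    Namespace: AWS/ApiGateway
                    Period: 60
                    Statistic: Sum
                    Threshold: 1
                    TreatMissingData: notBreaching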

One thing I think is especially important to monitor is Lambda scaling. The default account limit is 1000 concurrent instances, and 1000 Lambda instances running non-stop would be very expensive. While Amazon claims that CloudFront protects against DDoS attacks, I still want to know if someone runs an ab -c 1000 on my endpoint... which isn't really a DDoS.
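A minimal sketch of what such a concurrency alarm might look like, assuming the account-level `ConcurrentExecutions` metric in the `AWS/Lambda` namespace and reusing the `EmailTopic` referenced in the template above. The logical ID `LambdaConcurrencyAlarm` and the threshold of 500 are hypothetical values chosen for illustration, not tested recommendations:

```yaml
LambdaConcurrencyAlarm:                      # hypothetical logical ID
    Type: AWS::CloudWatch::Alarm
    Properties:
        AlarmDescription: Lambda concurrency is unusually high
        Namespace: AWS/Lambda
        MetricName: ConcurrentExecutions     # no Dimensions: all functions in the region
        Statistic: Maximum                   # highest concurrency seen in each period
        Period: 60
        EvaluationPeriods: 1
        DatapointsToAlarm: 1
        ComparisonOperator: GreaterThanThreshold
        Threshold: 500                       # assumed value: half of the default limit of 1000
        TreatMissingData: notBreaching       # no invocations at all counts as OK
        AlarmActions:
            - !Ref EmailTopic
```

With these settings the alarm would notify as soon as any single minute peaks above 500 concurrent executions, which should happen well before the account limit starts throttling requests.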

victormacko commented 4 years ago

There's a serverless plugin called 'serverless-plugin-aws-alerts' which might help here.

Nemo64 commented 4 years ago

Good hint, although from a quick look it seems to be just a slightly different syntax for vanilla alarms. I think I have to experiment more with it to see what problems that plugin tries to solve.

mnapoli commented 4 years ago

That's interesting, I would be 👍 to merge documentation about this.

This could fit well as a new article in the Environment section:

One thing to keep in mind if you add examples: explain clearly what each alert is about. The code example you posted is interesting, but to be honest it's hard to understand exactly when the alarm triggers.
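As an illustration of that kind of explanation, the evaluation-related properties of the alarm posted above could be annotated roughly like this. The comments are one reading of the CloudWatch alarm settings, offered as a sketch and not as official wording:

```yaml
Namespace: AWS/ApiGateway
MetricName: 4XXError                     # number of 4xx responses returned by the API
Statistic: Sum                           # add up those responses within each period
Period: 60                               # one datapoint per 60 seconds
EvaluationPeriods: 1                     # only the most recent datapoint is considered
DatapointsToAlarm: 1                     # a single breaching datapoint is enough
ComparisonOperator: GreaterThanThreshold
Threshold: 1                             # breaching means Sum > 1, i.e. 2 or more 4xx responses
TreatMissingData: notBreaching           # a minute without data counts as OK
```

Read together: the alarm fires as soon as a single one-minute window contains two or more 4xx responses, and it returns to OK once a later minute has at most one (or no traffic at all).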