ActoKids / AD440_W19_CloudPracticum


Alerts and emails for failures #74

Open mrvirus9898 opened 5 years ago

mrvirus9898 commented 5 years ago

Our crawlers are well built, but even the best programs can fail, and not always through their own fault. We need an alert system that issues alerts when our functions fail to execute. It could be script based or built on AWS settings. You are not limited in what you alert on, but every alert must be sent by email.

Please indicate the time spent on this, any issues you are having, and any good references you found on the subject, and credit anyone who helped you out.

TyReed12 commented 5 years ago

Update: Estimated time to be spent: 6 hours.

In the process of creating alarms. Messaged DevOps to get permissions to create a notification list (the list of users who are notified when an alarm is triggered).

Researching how to configure the following alarms:
1) Crawler trigger failed (trigger not invoked)
2) Crawler function running longer than expected
3) Crawler invoked but failed
4) Error writing to Dynamo
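As a rough starting point for that research, the sketch below pairs each of the four failure modes with a CloudWatch metric that could plausibly back it. The pairing is an assumption rather than anything decided in this thread, and the key names (`trigger_not_invoked` and so on) are made up for illustration. In particular, a failed DynamoDB write inside a Lambda function may be easier to catch through the function's own `Errors` metric than through `SystemErrors`.

```python
# Candidate CloudWatch metrics for the four failure modes listed above.
# The failure-mode keys are hypothetical; the pairing is an assumption.
CANDIDATE_METRICS = {
    "trigger_not_invoked": {
        "Namespace": "AWS/Lambda",
        "MetricName": "Invocations",
        "Statistic": "Sum",
        "ComparisonOperator": "LessThanThreshold",
    },
    "running_too_long": {
        "Namespace": "AWS/Lambda",
        "MetricName": "Duration",
        "Statistic": "Maximum",
        "ComparisonOperator": "GreaterThanThreshold",
    },
    "invoked_but_failed": {
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Statistic": "Sum",
        "ComparisonOperator": "GreaterThanThreshold",
    },
    "dynamo_write_error": {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "SystemErrors",
        "Statistic": "Sum",
        "ComparisonOperator": "GreaterThanThreshold",
    },
}


def describe(failure_mode):
    """Return a one-line summary of the metric chosen for a failure mode."""
    m = CANDIDATE_METRICS[failure_mode]
    return (
        f"{m['Namespace']}/{m['MetricName']} "
        f"({m['Statistic']}) {m['ComparisonOperator']}"
    )
```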

TyReed12 commented 5 years ago

Documenting permissions bug: Added Anar from DevOps to task.

TyReed12 commented 5 years ago

Created an issue for the permissions problem and marked it as a bug. Assigned several members from DevOps. See: Permissions Bug with creating Alarms in Cloudwatch #15

TyReed12 commented 5 years ago

@dashinay has fixed the permissions issue. Continuing work on Alarms. So far I've created a notification list named CrawlerAlarms with my email listed. I've created an alarm to make sure both the googlecalcrawler and ebcrawler are invoked every 24 hours. If they are not invoked, the alarm state will change to Alarm and send me a notification email.

[Screenshot: subscription confirmed]
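For anyone who would rather script this than click through the console, here is a minimal sketch of the same setup. It assumes the notification list is an SNS topic and that the clients passed in behave like boto3's `sns` and `cloudwatch` clients. The alarm name and the choice to treat missing data as breaching are my assumptions, not the settings actually used.

```python
def create_notification_list(sns, topic_name, emails):
    """Create (or reuse) an SNS topic and subscribe each address by email."""
    # create_topic returns the existing topic's ARN if the name is already taken.
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    for email in emails:
        # Each recipient still has to confirm via the link SNS emails them.
        sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


def alarm_if_not_invoked(cloudwatch, function_name, topic_arn):
    """Alarm when a Lambda function records no invocations in 24 hours."""
    cloudwatch.put_metric_alarm(
        AlarmName=f"{function_name}-not-invoked",  # hypothetical name
        Namespace="AWS/Lambda",
        MetricName="Invocations",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        Statistic="Sum",
        Period=86400,  # one 24-hour window, in seconds
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="LessThanThreshold",
        # A function that never runs emits no datapoints at all, so missing
        # data has to count as a breach (assumption about the intended setup).
        TreatMissingData="breaching",
        AlarmActions=[topic_arn],
    )


# Usage with real clients (requires boto3 and AWS credentials):
#   import boto3
#   arn = create_notification_list(
#       boto3.client("sns"), "CrawlerAlarms", ["me@example.com"])
#   for name in ("googlecalcrawler", "ebcrawler"):
#       alarm_if_not_invoked(boto3.client("cloudwatch"), name, arn)
```

Passing the clients in as arguments keeps the functions easy to try against a stub before touching the real account.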

TyReed12 commented 5 years ago

[Screenshot: alarmtest1] Confirmation that the alarm and notification list are working.

TyReed12 commented 5 years ago

[Screenshot: cpualarm] Confirmation that the CPU utilization alarm is working.

TyReed12 commented 5 years ago

Issue Wrap Up:

Created the following four alarms:

SSCrawlerErrorAlarm - this alarm sends a notification through Amazon SNS whenever the SSCrawler Lambda function ends in error.

CPUUtilizationAlarm - this alarm sends a notification whenever EC2 CPU utilization averages over 80% within 15 minutes.

GoogleCrawlerErrorAlarm - this alarm sends a notification through Amazon SNS whenever the GoogleCrawler Lambda function ends in error.

IsGoogleCrawlerInvoked - this alarm sends a notification whenever the GoogleCrawler function hasn't been invoked over a 24-hour period.
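For reference, the error and CPU alarms above could be expressed roughly as the parameter sets below, suitable for passing to CloudWatch's `put_metric_alarm`. This is a sketch under assumptions: the 5-minute error period, the missing-data handling, the topic ARN, and the instance ID are placeholders and were not read from the real alarms.

```python
def lambda_error_alarm(alarm_name, function_name, topic_arn):
    """Parameters for an alarm that fires when a Lambda function reports errors."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # assumption: 5-minute window
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # No datapoints means the function did not run, which is not an error.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def cpu_utilization_alarm(alarm_name, instance_id, topic_arn):
    """Parameters for an alarm on EC2 CPU averaging over 80% in 15 minutes."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 900,  # 15 minutes, in seconds
        "EvaluationPeriods": 1,
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }


# Usage with a real client (requires boto3 and AWS credentials);
# the ARN and instance ID here are placeholders:
#   import boto3
#   cw = boto3.client("cloudwatch")
#   arn = "arn:aws:sns:us-west-2:123456789012:CrawlerAlarms"
#   cw.put_metric_alarm(**lambda_error_alarm("SSCrawlerErrorAlarm", "SSCrawler", arn))
#   cw.put_metric_alarm(**cpu_utilization_alarm("CPUUtilizationAlarm", "i-0123456789abcdef0", arn))
```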

[Screenshot: alarmspage]

Estimated time to be spent: 6 hours