eosnetworkfoundation / engineering

A workspace for documentation by Engineering primarily regarding process
MIT License

Fix EVM CloudWatch Alerts #66

kj4ezj closed 1 year ago

kj4ezj commented 1 year ago

Customers have not been receiving email alerts when the health check status of EVM cloud infrastructure changes. Email alerts work with test events, so the problem must be in CloudWatch or between CloudWatch and the SNS topic. Debug and fix this so alerts start flowing again.
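One way to narrow down a CloudWatch-to-SNS gap like this is to check whether each alarm actually lists the SNS topic among its actions. The sketch below is an illustration only, not the fix that was applied here. It assumes alarm descriptions shaped like the `MetricAlarms` entries returned by CloudWatch's DescribeAlarms API (`AlarmName`, `ActionsEnabled`, `AlarmActions`, `OKActions`). The alarm names, account ID, and topic ARN are hypothetical.

```python
# Hypothetical topic ARN, for illustration only.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:evm-health-alerts"

# Hypothetical alarm descriptions, shaped like DescribeAlarms "MetricAlarms" entries.
SAMPLE_ALARMS = [
    {
        "AlarmName": "evm-api-health",
        "ActionsEnabled": True,
        "AlarmActions": [TOPIC_ARN],
        "OKActions": [TOPIC_ARN],
    },
    {
        "AlarmName": "evm-rpc-health",
        "ActionsEnabled": False,
        "AlarmActions": [TOPIC_ARN],
        "OKActions": [],
    },
]


def diagnose_alarm(alarm: dict, topic_arn: str) -> list[str]:
    """List the reasons this alarm would fail to notify topic_arn on a status change."""
    problems = []
    # An alarm with actions disabled changes state without notifying anyone.
    if not alarm.get("ActionsEnabled", False):
        problems.append("actions disabled")
    # Transitions into ALARM only reach the topic if it is an alarm action.
    if topic_arn not in alarm.get("AlarmActions", []):
        problems.append("topic missing from AlarmActions")
    # Recoveries back to OK only reach the topic if it is an OK action.
    if topic_arn not in alarm.get("OKActions", []):
        problems.append("topic missing from OKActions")
    return problems


if __name__ == "__main__":
    for alarm in SAMPLE_ALARMS:
        print(alarm["AlarmName"], diagnose_alarm(alarm, TOPIC_ARN))
```

With boto3 installed and credentials configured, the same dictionaries could be read from `boto3.client("cloudwatch").describe_alarms()["MetricAlarms"]` instead of the sample list. An alarm that passes all three checks but still sends no email would point toward the topic's subscriptions or access policy instead.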

See Also

engineering issue 68 - EVM Monitoring and Alerting - Phase 1

  1. eos-evm issue 602 - Funnel EVM Health Checks into CloudWatch
  2. engineering issue 48 - Collaborate with Operations on Unified Dashboarding Solution
  3. engineering issue 49 - Create Bot to Alert via IM on Specific Metrics
  4. eos-evm issue 603 - SMS Alerting for EVM Infrastructure Health Checks
  5. engineering issue 65 - Email Alerting for EVM Infrastructure Health Checks
  6. telegram-bot issue 1 - Open-Source This Repo
  7. engineering issue 57 - Create Telegram Service Account
  8. engineering issue 58 - Create EVM Testnet Alert Channel Using Telegram Service Account
  9. engineering issue 64 - Create EVM Mainnet Alert Channel Using Telegram Service Account
  10. engineering issue 66 - Fix EVM CloudWatch Alerts
  11. telegram-bot issue 2 - Human-Friendly Alerts
  12. engineering issue 71 - EVM Alerts for APAC Infrastructure
  13. telegram-bot issue 3 - Alert Bot Maintainer via Telegram on Errors

wanderingbort commented 1 year ago

This is considered done.

kj4ezj commented 1 year ago

Alerts are flowing in the US region(s), but technical limitations prevent alerts for APAC infrastructure from flowing. Those limitations will be addressed in a subsequent ticket, linked above and in the parent epic. This does not impair EVM readiness because of how the system is architected: APAC infrastructure outages fail over to US infrastructure, whose alerts were fixed as part of this ticket.