ministryofjustice / modernisation-platform

A place for the core work of the Modernisation Platform • This repository is defined and managed in Terraform
https://user-guide.modernisation-platform.service.justice.gov.uk
MIT License

Improve monitoring of Transit Gateway attachments #2020

Open dms1981 opened 2 years ago

dms1981 commented 2 years ago

User Story

As a Modernisation Platform Engineer
I want to receive alerts when Transit Gateway events occur
So that I can resolve issues that could impact Modernisation Platform customers

User Type(s)

Value

With active monitoring of Transit Gateway events through AWS Network Manager, we can take action to resolve issues before they are reported to us by our customers.

Questions / Assumptions / Hypothesis

Questions

Where would we send the alerts? PagerDuty? Slack?

Hypothesis

At present we rely on metric-based alerts, which are only meaningful when VPC attachments are in active use. By using Network Manager events we can monitor what we're really interested in, without the overhead of generating traffic for VPCs that have no active customers.

Proposal

Look into AWS Network Manager as a way of monitoring Transit Gateway events, as these events cannot, so far as I know, be monitored directly against Transit Gateway.
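
To make the proposal a bit more concrete, here is a minimal sketch of how Network Manager events might be filtered before being forwarded to an alerting channel. It assumes the events arrive via EventBridge in the shape shown in the AWS documentation linked under Reference (source `aws.networkmanager`, detail-type `Network Manager Topology Change`, and a `changeType` field in `detail`). The function name `build_alert` and the set of change types treated as alert-worthy are hypothetical choices for illustration, not something already agreed or implemented in this repository.

```python
from typing import Optional

# Assumed from the event examples in the AWS Network Manager docs.
NETWORK_MANAGER_SOURCE = "aws.networkmanager"
TOPOLOGY_CHANGE = "Network Manager Topology Change"

# Hypothetical: the change types we might want to be alerted about.
ALERT_CHANGE_TYPES = {"VPC-ATTACHMENT-DELETED", "VPN-CONNECTION-DELETED"}


def build_alert(event: dict) -> Optional[str]:
    """Return an alert message for an alert-worthy event, otherwise None."""
    # Ignore anything that is not a Network Manager topology change.
    if event.get("source") != NETWORK_MANAGER_SOURCE:
        return None
    if event.get("detail-type") != TOPOLOGY_CHANGE:
        return None

    detail = event.get("detail", {})
    change_type = detail.get("changeType")
    if change_type not in ALERT_CHANGE_TYPES:
        return None

    # Fall back to placeholders if the optional fields are missing.
    attachment = detail.get("transitGatewayAttachmentArn", "unknown attachment")
    region = detail.get("region", "unknown region")
    return f"{change_type}: {attachment} ({region})"
```

A function like this could sit in a Lambda behind an EventBridge rule, with the returned message passed on to whichever destination we settle on. Where the alerts should go is still the open question above.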

Definition of done

Reference

How to write good user stories
https://docs.aws.amazon.com/vpc/latest/tgwnm/monitoring-events.html#network-topology-events

davidkelliott commented 2 years ago

This may also improve the transit gateway monitoring - https://aws.amazon.com/about-aws/whats-new/2022/07/amazon-vpc-flow-logs-transit-gateway-improved-visibility-monitoring/

dms1981 commented 1 year ago

This looks to be only partially available through Terraform. I can see that it's possible to create a Network Manager site through Terraform, but I don't see a way to onboard that into CloudWatch Insights through Terraform.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 90 days with no activity.
