ministryofjustice / cloud-platform

Documentation on the MoJ cloud platform
MIT License

Create alert when low/no IP prefixes available to live cluster #6311

Status: Open. kyphutruong opened this issue 1 month ago

kyphutruong commented 1 month ago

Background

vpc-cni/ipamd logs are now shipped to [OpenSearch](https://app-logs.cloud-platform.service.justice.gov.uk/_dashboards/app/data-explorer/discover#?_a=(discover:(columns:!(_source),isDirty:!f,sort:!()),metadata:(indexPattern:'6d5c66d0-8d35-11ef-a6ba-a7191f5fb1c2',view:discover))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15m,to:now))&_q=(filters:!(),query:(language:kuery,query:''))).

Set up alerts that fire when a node reports CNI prefix-allocation errors, or when few or no IP prefixes remain available to the live cluster.

Proposed user journey

Alerts are sent to the low-priority channel when few or no IP prefixes are available to the live cluster.

Approach

Which part of the user docs does this impact

Communicate changes

Questions / Assumptions

Definition of done

Reference

How to write good user stories

kyphutruong commented 1 week ago

I think we can use a query for the string InsufficientCidrBlocks as the criterion for the alert.

https://moj.enterprise.slack.com/archives/C05RE26R8TW/p1695299579240409

https://moj.enterprise.slack.com/archives/C05RE26R8TW/p1695289492516869
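
A rough sketch of how the InsufficientCidrBlocks criterion could be expressed, in case it helps the discussion. Everything here beyond the string itself is an assumption: the field names `log`, `node` and `@timestamp`, the 15-minute window, the threshold of one match, and the sample records (which are made up, not real ipamd output). `build_query` returns an OpenSearch query DSL body using `match_phrase` and a `range` filter; `nodes_to_alert` shows the same check applied to records already fetched.

```python
from collections import Counter
from typing import Iterable, Mapping

# String proposed in the comment above as the alert criterion.
ERROR_MARKER = "InsufficientCidrBlocks"


def build_query(message_field: str = "log", window: str = "15m") -> dict:
    """Build a query DSL body matching ERROR_MARKER within a recent window.

    The field names are hypothetical and must be adapted to the real index.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"match_phrase": {message_field: ERROR_MARKER}},
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        }
    }


def count_prefix_errors(records: Iterable[Mapping[str, str]]) -> Counter:
    """Count, per node, the records whose message contains ERROR_MARKER."""
    counts: Counter = Counter()
    for record in records:
        if ERROR_MARKER in record.get("log", ""):
            counts[record.get("node", "unknown")] += 1
    return counts


def nodes_to_alert(records: Iterable[Mapping[str, str]], threshold: int = 1) -> list:
    """Return the sorted node names with at least `threshold` matching records."""
    counts = count_prefix_errors(records)
    return sorted(node for node, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Illustrative records only; the message text is invented for this sketch.
    sample = [
        {"node": "node-a", "log": "prefix allocation failed: InsufficientCidrBlocks"},
        {"node": "node-b", "log": "reconcile completed"},
    ]
    print(build_query())
    print(nodes_to_alert(sample))
```

A threshold of one match per node is probably the simplest starting point for a low-priority alert; it can be raised if the alert turns out to be noisy.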