hyperlane-xyz / hyperlane-monorepo

The home for Hyperlane core contracts, sdk packages, and other infrastructure
https://hyperlane.xyz

Add monitoring and alerting for hardware usage of kubernetes pods #4109

Closed daniel-savu closed 2 months ago

daniel-savu commented 3 months ago

Problem

We only noticed the recent Gas Escalator rollout issue by observing a weird pattern in the RC prep queue dashboard. Ideally we should alert on increased resource usage that is sustained for at least 1h, to filter out startup CPU spikes and similar. The hyperlane and neutron contexts should be high severity, and rc can be low severity.
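
To make the "sustained for at least 1h" idea concrete, here is a minimal Python sketch of the check. It is not the alert that was eventually set up. The `Sample` type, the function names, the 0.9 usage threshold, and the fallback to low severity for unknown contexts are all assumptions made for illustration. Only the 1h window and the per-context severities come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Severities per context as proposed in the issue; the dict itself is illustrative.
SEVERITY_BY_CONTEXT = {"hyperlane": "high", "neutron": "high", "rc": "low"}


@dataclass
class Sample:
    timestamp: float  # seconds since epoch
    usage: float      # fraction of the pod's resource limit (assumed unit)


def sustained_breach(samples: List[Sample], threshold: float,
                     window_seconds: float = 3600.0) -> bool:
    """Return True if usage stayed above threshold for the whole trailing window."""
    if not samples:
        return False
    ordered = sorted(samples, key=lambda s: s.timestamp)
    window_start = ordered[-1].timestamp - window_seconds
    # Without data reaching back to the window start we cannot claim a full
    # hour of high usage, so a freshly started pod never triggers.
    if ordered[0].timestamp > window_start:
        return False
    in_window = [s for s in ordered if s.timestamp >= window_start]
    # A single sample at or below the threshold resets the condition,
    # which is what filters out short startup spikes.
    return all(s.usage > threshold for s in in_window)


def alert_severity(context: str, samples: List[Sample],
                   threshold: float = 0.9) -> Optional[str]:
    """Return the severity to alert with, or None if no alert should fire."""
    if not sustained_breach(samples, threshold):
        return None
    # Defaulting unknown contexts to "low" is an assumption, not from the issue.
    return SEVERITY_BY_CONTEXT.get(context, "low")
```

With one sample per minute, an hour of usage above the threshold yields `"high"` for the hyperlane context and `"low"` for rc, while a five-minute startup spike followed by normal usage yields `None`.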

tkporter commented 3 months ago

See https://abacusworks.grafana.net/d/FSR9YWr7k/containers?orgId=1&refresh=1m

daniel-savu commented 2 months ago

done by @tkporter in https://github.com/hyperlane-xyz/hyperlane-monorepo/pull/4158 and then by creating these alerts: