cds-snc / notification-planning-core

Project planning for GC Notify Core Team
0 stars 0 forks source link

Create a threshold alarm on unsuccessful DNS resolution issues #73

Open jimleroyer opened 1 year ago

jimleroyer commented 1 year ago

Description

As an ops lead of GCNotify, I want to know when there are DNS resolution issues, So that I can be proactive on failing components of the stack.

WHY are we building?

We want to know of any DNS issues we might run into early, as these tend to propagate higher in the stack.

WHAT are we building?

An alarm that detects a certain threshold of unresolved DNS requests.

VALUE created by our solution

Increased observability and proactive steps.

Acceptance Criteria

QA Steps

Info

This is an action item following the incident 2023-04-07-could-not-translate-read-only-db-proxy-host-name.

sastels commented 1 year ago

Will require investigation: how do (can?) we do this?

Can possibly scrape the k8s (or other?) logs to find DNS problems?