cds-snc / notification-planning

Project planning for GC Notify Team

Reduce alarm noise by filtering out duplicate key errors #626

Open jimleroyer opened 2 years ago

jimleroyer commented 2 years ago

Description

We get too much noise in the #notification-ops channel from unique key violation errors. These are expected because we use standard SQS queues, which do not guarantee exactly-once delivery, so the same message can arrive more than once. Hence, fine-tuning the alarm to tolerate a small number of duplicates should be OK, while still raising an alarm if too many are reported in a short period of time (i.e. a few minutes).
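One way to picture the "tolerate a few, alarm on many" behaviour is a sliding-window counter. The sketch below is only an illustration: the class name `DuplicateKeyAlarm`, the threshold, and the window length are hypothetical placeholders, not values or tooling the team has agreed on.

```python
from collections import deque


class DuplicateKeyAlarm:
    """Hypothetical sliding-window threshold; names and defaults are illustrative."""

    def __init__(self, threshold=10, window_seconds=300):
        self.threshold = threshold            # duplicates tolerated per window (assumed value)
        self.window_seconds = window_seconds  # "a few minutes" (assumed value)
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one duplicate key error; return True if the alarm should fire."""
        self._timestamps.append(timestamp)
        # Drop errors that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        # Fire only when the count within the window exceeds the tolerated number.
        return len(self._timestamps) > self.threshold


alarm = DuplicateKeyAlarm(threshold=3, window_seconds=60)
fired = [alarm.record(t) for t in (0, 10, 20, 30)]
```

With these illustrative settings, the first three errors inside a minute stay silent and only the fourth one trips the alarm; an isolated error later on does not.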

As a person on support, I need fewer alarms about inconsequential duplicate key errors so that I can focus on higher-priority alarms.

WHY are we building?

Less noise in the notification-ops channel and more focus on important alarms.

WHAT are we building?

A better alarm filter.

VALUE created by our solution

More focus and less distraction.

Acceptance Criteria (Definition of done)

QA Steps

Additional context

Patrick wrote:

Yeah, it’s just a whole wack of the expected unique violation warnings:

(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "notifications_pkey"

I think it would be a good idea to filter these out into their own alert category so we don’t get all the alarm messages for them: https://github.com/cds-snc/notification-api/issues/1458#issuecomment-1108492008
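As a rough sketch of what "their own alert category" could mean, the snippet below routes log messages matching the error text quoted above into a separate bucket. The function name `alert_category` and the category labels are hypothetical; the actual routing would live in whatever alerting tool the team uses.

```python
import re

# Matches the expected error quoted above; the pattern itself is an assumption
# about how the message appears in the logs.
DUPLICATE_KEY_PATTERN = re.compile(
    r"psycopg2\.errors\.UniqueViolation.*"
    r'duplicate key value violates unique constraint "notifications_pkey"'
)


def alert_category(message):
    """Return a hypothetical category label for a log message."""
    if DUPLICATE_KEY_PATTERN.search(message):
        return "expected-duplicate-key"
    return "default"


sample = (
    "(psycopg2.errors.UniqueViolation) duplicate key value violates "
    'unique constraint "notifications_pkey"'
)
category = alert_category(sample)
```

Matching on the constraint name keeps the filter narrow: unique violations on other constraints would still land in the default category.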

yaelberger-commits commented 2 years ago

@jimleroyer Are these duplicate key errors still happening?

yaelberger-commits commented 1 year ago

Hey team! Please add your planning poker estimate with Zenhub @andrewleith @jimleroyer @jzbahrai @Pensai @sastels @smcmurtry

jimleroyer commented 1 year ago

@yaelberger-commits This topic came back this week. Definitely still valid and stealing support focus.

yaelberger-commits commented 1 year ago

Revisit this in January or February to evaluate whether the changes to New Relic did the job or whether more effort is needed.

yaelberger-commits commented 1 year ago

@jimleroyer @jzbahrai Has this issue been resolved or are we still seeing too many duplicate key errors?
