CDCgov / trusted-intermediary

Bringing together healthcare providers by reducing the connection burden.
Apache License 2.0

TI Internal Error Monitoring #1143

Open sfradkin opened 1 month ago

sfradkin commented 1 month ago

Story

As an Intermediary engineer, so that I can notify CA about any errors that occur, I need a way to identify errors that occur during intermediary processing within the TI service.

Pre-conditions

Acceptance Criteria

Tasks

Research

Engineering

Definition of Done

Research Questions

Decisions

Notes

luis-pabon-tf commented 3 weeks ago

Tied to #500, might be blocked at the moment. Will update after discussion...

luis-pabon-tf commented 3 weeks ago

jcrichlake commented 3 weeks ago

Splunk integration shouldn't be required for this card; it exists to satisfy a separate CDC log aggregation requirement. For this card, we probably just need to add some error-based queries to our logs.tf file in TI. There are two example queries already in that file, so you just have to tinker in the Azure portal to figure out which query, or set of queries, would be helpful for the scenario described above.

basiliskus commented 3 weeks ago

We can use this log query to get the errors from the Azure logs:

AppServiceConsoleLogs
| project JsonResult = parse_json(ResultDescription)
| evaluate bag_unpack(JsonResult)
| where level == 'ERROR'
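
A possible variation on the query above, offered only as a sketch: it keeps the timestamp so errors can be counted per hour. It assumes the same `level` field inside the JSON in `ResultDescription` and the standard `TimeGenerated` column of `AppServiceConsoleLogs`. The one-hour bin and the `ErrorCount` name are arbitrary choices, not anything decided in this ticket.

```kusto
// Sketch only: hourly count of ERROR-level log entries.
AppServiceConsoleLogs
// Parse the JSON payload but keep the other columns (e.g. TimeGenerated).
| extend JsonResult = parse_json(ResultDescription)
// Same filter as the query above, read from the parsed payload.
| where tostring(JsonResult.level) == 'ERROR'
// Assumed bucketing: one row per hour with the number of errors.
| summarize ErrorCount = count() by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
```

Using `extend` rather than `project` leaves `TimeGenerated` available for the `summarize` step. The right bucket size would need to be checked against real log volume in the Azure portal.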

basiliskus commented 10 hours ago

The people on the team who were tasked with adding the query are still waiting for access to the Azure resources they need. I'll look into this ticket today to check whether I can add it.

basiliskus commented 4 hours ago

I added the Terraform with the query and tested it in the dev environment. Tomorrow I will deploy it to the rest of the environments.