Closed JohnNKing closed 2 weeks ago
Blocked waiting on access and a meeting to show us the process
Discussing RS proposal today
We now have access to RS logs in staging and production. We'll evaluate whether it's possible to create alerts for error logs
RS is actively working on adding log parameters that will allow us to query the logs, filtering by topic, sender, receiver, etc. Here's the PR: https://github.com/CDCgov/prime-reportstream/pull/15263
Currently RS doesn't have the permissions required to create the log alerts and queries. They will look into getting access
I'll mark this story as blocked while RS works on that PR and the ability to create alerts
RS PR has been merged. Removing the blocked label
Here's a KQL query to filter by the etor-ti topic, now that the capability has been added to RS:
traces
| extend customDimensionsParsed = parse_json(customDimensions)
| where customDimensionsParsed.TOPIC == '"etor-ti"'
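Since the goal is to alert on error logs, the topic filter above could be combined with a severity filter and a time bucket. This is only a sketch: the one-hour look-back, the five-minute bucket, and the `errorCount` column name are assumptions chosen for illustration, and `severityLevel >= 3` relies on the Application Insights convention that 3 is Error and 4 is Critical.

```kql
traces
| where timestamp > ago(1h)                                   // assumed look-back window
| extend customDimensionsParsed = parse_json(customDimensions)
| where customDimensionsParsed.TOPIC == '"etor-ti"'           // same topic filter as the query above
| where severityLevel >= 3                                    // Error and Critical only
| summarize errorCount = count() by bin(timestamp, 5m)        // assumed 5-minute buckets
| order by timestamp asc
```

A query shaped like this returns one row per bucket that had errors, so an alert rule could fire whenever the result set is not empty.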
Here's a first version of a KQL query to select and unpack the fields we care about
traces
| project timestamp, message, customDimensions, appId
| extend
    messageParsed = parse_json(message),
    customDimensionsParsed = parse_json(customDimensions)
| extend
    mdc_span_id = messageParsed.mdc.span_id,
    mdc_trace_flags = messageParsed.mdc.trace_flags,
    mdc_trace_id = messageParsed.mdc.trace_id,
    message_content = messageParsed.message,
    message_thread = messageParsed.thread,
    message_timestamp = messageParsed.timestamp,
    message_level = messageParsed.level,
    message_logger = messageParsed.logger,
    customDimensions_ProcessId = customDimensionsParsed.ProcessId,
    customDimensions_Category = customDimensionsParsed.Category,
    customDimensions_HostInstanceId = customDimensionsParsed.HostInstanceId,
    customDimensions_LogLevel = customDimensionsParsed.LogLevel
| project
    timestamp,
    appId,
    mdc_span_id,
    mdc_trace_flags,
    mdc_trace_id,
    message_content,
    message_thread,
    message_timestamp,
    message_level,
    message_logger,
    customDimensions_ProcessId,
    customDimensions_Category,
    customDimensions_HostInstanceId,
    customDimensions_LogLevel
Another query with extended fields found in customDimensions:
traces
| extend
    messageParsed = parse_json(message),
    customDimensionsParsed = parse_json(customDimensions)
| extend
    messageTimestamp = messageParsed.timestamp,
    messageLevel = messageParsed.level,
    messageContent = messageParsed.message,
    messageLogger = messageParsed.logger,
    messageMdc = messageParsed.mdc,
    messageThread = messageParsed.thread,
    customProcessId = customDimensionsParsed.ProcessId,
    customCategory = customDimensionsParsed.Category,
    customHostInstanceId = customDimensionsParsed.HostInstanceId,
    customLogLevel = customDimensionsParsed.LogLevel,
    customPipelineStepName = customDimensionsParsed.pipelineStepName,
    customParentReportId = customDimensionsParsed.parentReportId,
    customChildReportId = customDimensionsParsed.childReportId,
    customBLOB_URL = customDimensionsParsed.BLOB_URL,
    customBlobUrl = customDimensionsParsed.blobUrl,
    customCdProcessId = customDimensionsParsed.ProcessId,
    customSender = customDimensionsParsed.sender,
    customSubmittedReportIds = customDimensionsParsed.submittedReportIds,
    customCdTimestamp = customDimensionsParsed.timestamp,
    customTopic = customDimensionsParsed.topic,
    customTrackingId = customDimensionsParsed.trackingId
| project
    timestamp,
    message,
    customDimensions,
    messageTimestamp,
    messageLevel,
    messageContent,
    messageLogger,
    messageMdc,
    messageThread,
    customProcessId,
    customCategory,
    customHostInstanceId,
    customLogLevel,
    customPipelineStepName,
    customParentReportId,
    customChildReportId,
    customBLOB_URL,
    customBlobUrl,
    customCdProcessId,
    customSender,
    customSubmittedReportIds,
    customCdTimestamp,
    customTopic,
    customTrackingId
| where customTopic == "etor-ti"
Documented AppInsights KQL queries in RS: https://github.com/CDCgov/prime-reportstream/blob/master/prime-router/docs/observability/azure-events.md
After the most recent update in RS, this is how we can query logs by sender name:
customEvents
| where name == "REPORT_RECEIVED"
| extend params = parse_json(tostring(customDimensions.params))
| where params.senderName == "flexion.etor-service-sender"
By receiver name:
customEvents
| where name == "REPORT_SENT"
| extend params = parse_json(tostring(customDimensions.params))
| where params.receiverName == "la-phl.etor-nbs-orders"
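The sender and receiver queries above can be folded into one query that counts both event types per day, which makes it easier to see whether received and sent volumes line up. Treat this as a rough sketch: the 30-day window, the daily bucket, and the `eventCount` column name are assumptions, and it only looks at the single sender and single receiver named above, so the two counts are not guaranteed to match if other receivers are involved.

```kql
customEvents
| where timestamp > ago(30d)                                  // assumed look-back window
| where name in ("REPORT_RECEIVED", "REPORT_SENT")
| extend params = parse_json(tostring(customDimensions.params))
| where (name == "REPORT_RECEIVED" and tostring(params.senderName) == "flexion.etor-service-sender")
     or (name == "REPORT_SENT" and tostring(params.receiverName) == "la-phl.etor-nbs-orders")
| summarize eventCount = count() by name, bin(timestamp, 1d)  // one row per event type per day
| order by timestamp asc, name asc
```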
It seems this is where the terraform resource should be added in RS: operations/app/terraform/modules/application_insights/
I'm documenting error scenarios in RS and the queries associated with troubleshooting them here
Just one thing I noticed in comparing REPORT_RECEIVED to REPORT_SENT on the etor-ti topic: there's a slight disparity that not routed outcomes and trace errors don't seem to account for.
Past month:
Looking at just Aug 13-15 (midnight-midnight UTC)
I found the error for the message that arrived at RS at 7:39 PM UTC on 8/13: Unexpected scheme: null. To get the error you can use this query: `traces | where severityLevel > 1`
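The severity filter on its own can return a lot of rows, so a narrower variant may be easier to work with. The sketch below assumes a seven-day look-back and searches the `message` column for the error text quoted above; both choices are illustrative, and the error text may live in a different column depending on how the log entry was written.

```kql
traces
| where timestamp > ago(7d)                       // assumed look-back window
| where severityLevel > 1                         // same filter as the query above
| where message contains "Unexpected scheme"      // assumed location of the error text
| project timestamp, severityLevel, message, customDimensions
| order by timestamp desc
```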
Story
As an Intermediary engineer, so that I can notify CA about any errors that occur, I need a way to identify NBS errors that occur during intermediary processing within ReportStream.
Pre-conditions
Acceptance Criteria
Tasks
Engineering
AC Queries
This topic is not available for all log entries, though. It depends on where the log was generated
Definition of Done
/ig folder
/adr folder
README.md
ReportStream Setup section in README.md
Research Questions
Decisions
Notes