arcus-azure / arcus-service-to-service-correlation-poc

POC to have end-to-end correlation stitching operations across services together in the Azure Application Insights Application Map.
MIT License

Arcus - Service-to-Service Correlation POC

Scope

Provide end-to-end correlation stitching operations across services together in the Azure Application Insights Application Map.

When creating an order, the following flow occurs:

Requirements

Official Telemetry Correlation Guidance

As per the guidance:

Application Insights defines a data model for distributed telemetry correlation. To associate telemetry with a logical operation, every telemetry item has a context field called operation_Id. This identifier is shared by every telemetry item in the distributed trace. So even if you lose telemetry from a single layer, you can still associate telemetry reported by other components.

A distributed logical operation typically consists of a set of smaller operations that are requests processed by one of the components. These operations are defined by request telemetry. Every request telemetry item has its own id that identifies it uniquely and globally. And all telemetry items (such as traces and exceptions) that are associated with the request should set the operation_parentId to the value of the request id.

Every outgoing operation, such as an HTTP call to another component, is represented by dependency telemetry. Dependency telemetry also defines its own id that's globally unique. Request telemetry, initiated by this dependency call, uses this id as its operation_parentId.

You can build a view of the distributed logical operation by using operation_Id, operation_parentId, and request.id with dependency.id. These fields also define the causality order of telemetry calls.
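The data model quoted above can be illustrated with a small sketch. It is not part of this POC and does not use the Application Insights SDK. It only models telemetry items as plain Python objects and rebuilds the causality tree from operation_Id, operation_parentId, and the request/dependency ids. The class, the helper functions, and the operation names (such as `Process order`) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional
import uuid


@dataclass
class TelemetryItem:
    kind: str                  # "request", "dependency", "trace", ...
    name: str
    operation_id: str          # operation_Id: shared by every item in the distributed trace
    parent_id: Optional[str]   # operation_parentId: id of the request/dependency that caused this item
    id: Optional[str] = None   # only requests and dependencies carry their own id


def new_id() -> str:
    """Generate a short unique id (illustrative only, not the SDK's id format)."""
    return uuid.uuid4().hex[:16]


def render_tree(items, operation_id):
    """Return indented lines describing the causality tree of one logical operation."""
    # Group the items of this operation by their parent id.
    children = {}
    for item in items:
        if item.operation_id == operation_id:
            children.setdefault(item.parent_id, []).append(item)

    lines = []

    def walk(parent_id, depth):
        for item in children.get(parent_id, []):
            lines.append("  " * depth + f"{item.kind}: {item.name}")
            if item.id is not None:  # traces and exceptions have no children
                walk(item.id, depth + 1)

    walk(None, 0)  # the root request has no parent
    return lines


# Hypothetical flow: an API request publishes a message that a worker processes.
operation_id = new_id()
api_request = TelemetryItem("request", "POST /api/v1/market", operation_id, None, new_id())
queue_dependency = TelemetryItem("dependency", "Service Bus: orders", operation_id, api_request.id, new_id())
worker_request = TelemetryItem("request", "Process order", operation_id, queue_dependency.id, new_id())
worker_trace = TelemetryItem("trace", "Order processed", operation_id, worker_request.id)

# The order in which telemetry arrives does not matter; the ids define the tree.
items = [worker_trace, api_request, worker_request, queue_dependency]
tree = render_tree(items, operation_id)
print("\n".join(tree))
```

The point of the sketch is that the downstream request uses the upstream dependency id as its parent. That link is what lets the two services be stitched together.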

This means that we are handling the operation ID (aka operation_Id) correctly today, but we need to:

Learn more in this example.

Current Status

Here is what the end-to-end correlation across components looks like:

The telemetry tree looks as follows:

What is not included

Getting Started

Before you can run this, you need to:

  1. Create an Application Insights resource in your Azure Subscription
  2. Create an Azure Service Bus namespace resource in your Azure Subscription
  3. Create a queue called orders in the Azure Service Bus namespace
  4. Create a docker-compose.override.yml file and set the Application Insights instrumentation key and the Service Bus connection string
  5. Run the solution with Docker Compose by running docker compose up from the folder where the docker-compose.yml file is located
  6. Get bacon by calling the API - GET http://localhost:789/api/v1/bacon
  7. Create an order to eat bacon asynchronously by calling the API - POST http://localhost:787/api/v1/market

You can use a tool like Postman to perform API requests, or you can use the Swagger UI page which is available at localhost:787/api/docs for the Market API and at localhost:789/api/docs for the Bacon API.

An example request body for creating an order:

{
    "amount": 2
}
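As an alternative to Postman or the Swagger UI, steps 6 and 7 can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the JSON body shown above belongs to the POST request on the Market API and that the API accepts `application/json`. The function names are hypothetical.

```python
import json
import urllib.request

# Endpoints taken from the Getting Started steps.
BACON_API = "http://localhost:789/api/v1/bacon"
MARKET_API = "http://localhost:787/api/v1/market"


def build_bacon_request() -> urllib.request.Request:
    """Build the GET request that fetches bacon (step 6)."""
    return urllib.request.Request(BACON_API, method="GET")


def build_order_request(amount: int) -> urllib.request.Request:
    """Build the POST request that creates an order (step 7)."""
    body = json.dumps({"amount": amount}).encode("utf-8")
    return urllib.request.Request(
        MARKET_API,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

With the containers running, `send(build_bacon_request())` and `send(build_order_request(2))` perform the two calls. The resulting telemetry should then show up in Application Insights.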

How Does It Work with the Azure Application Insights SDK?

💡 This is currently achieved by using the Azure Application Insights SDK. We plan to port this to use only TelemetryClient, so that we know exactly which telemetry needs to be tracked where.

Here is what the end-to-end correlation across components looks like:

The telemetry tree looks as follows:

You can download the raw telemetry here.

Learnings

We can leverage the same capabilities through Serilog if we get inspiration from the Azure Application Insights SDK:

Action items

Some of the action items can be easily found by searching for TODO: Contribute Upstream or using the Task List.

Clarification Required

None at the moment.