omgnetwork / elixir-omg

OMG-Network repository of Watcher and Watcher Info
https://omg.network
Apache License 2.0

feat: add new metadata to Datadog traces #1763

Closed macaptain closed 4 years ago

macaptain commented 4 years ago

Overview

Because API responses always return HTTP 200, even when the response body contains an error message, observability suffers and it's hard to judge the health of the services. This PR adds extra tags to the spans of requests passing through the Watcher and Watcher Info Phoenix-based APIs, so that responses containing errors are flagged in the Datadog trace.

Datadog generates metrics from traces, but unfortunately it isn't possible to group traces by error_type in the metrics count (see https://docs.datadoghq.com/tracing/guide/metrics_namespace/#errors). We will therefore need more instrumentation that counts the errors with custom metrics. That will be a separate PR.

Changes

What you see in Datadog traces after this change:

[image: Datadog trace showing the new metadata fields]

This is implemented with the customize_metadata option of SpandexPhoenix (https://hexdocs.pm/spandex_phoenix/SpandexPhoenix.html): the callback parses the JSON response body with Jason (hopefully a relatively cheap operation) and adds the error fields to the default metadata.
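To make the approach concrete, here is a minimal sketch of such a callback. It is not the code from this PR. The module name, the tag names other than error_type, and the assumed error body shape (a "success" flag plus a "data" object carrying "code" and "description") are assumptions for illustration. It also assumes the spandex_phoenix and jason dependencies are available and that the span metadata accepts a :tags keyword list.

```elixir
defmodule TraceMetadataSketch do
  @moduledoc false

  # Hypothetical helper: derive span tags from an already-decoded response body.
  # ASSUMPTION: error responses look like
  #   %{"success" => false, "data" => %{"code" => ..., "description" => ...}}
  def error_tags(%{"success" => false, "data" => %{} = data}) do
    [
      error: true,
      error_type: Map.get(data, "code", "unknown"),
      error_msg: Map.get(data, "description", "")
    ]
  end

  # Anything else (successful responses, unexpected shapes) adds no tags.
  def error_tags(_decoded_body), do: []

  # Merge the error tags into the metadata keyword list, keeping existing tags.
  def add_error_tags(metadata, decoded_body) do
    case error_tags(decoded_body) do
      [] -> metadata
      tags -> Keyword.update(metadata, :tags, tags, &Keyword.merge(&1, tags))
    end
  end

  # Callback intended for the customize_metadata option of SpandexPhoenix.
  # Starts from the library's default metadata and only adds tags when the
  # body decodes as JSON and matches the assumed error shape.
  def customize_metadata(conn) do
    metadata = SpandexPhoenix.default_metadata(conn)

    case Jason.decode(conn.resp_body || "") do
      {:ok, decoded} -> add_error_tags(metadata, decoded)
      {:error, _reason} -> metadata
    end
  end
end
```

The callback would then be passed where SpandexPhoenix is set up, for example as customize_metadata: &TraceMetadataSketch.customize_metadata/1. Keeping the tag logic in pure functions makes it testable without a running endpoint.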

Testing

DD_API_KEY=<some-api-key> docker-compose -f docker-compose-watcher.yml -f docker-compose.dev.yml up

Make API calls to the endpoints, then open APM for the local-development environment in Datadog. Look for the services "watcher" and "watcher_info", and check their traces in the Datadog GUI.

P.S.

This is my first Elixir PR. Please let me know how my code can be more idiomatic!

macaptain commented 4 years ago

Thanks for the review @InoMurko.

I'll close this PR and open a new one, since there's too much to fix here in one pass. I'll re-use customize_metadata, but take a different approach and use assigns to attach the metadata to the conn.
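For readers wondering what the assigns-based variant could look like, here is a rough sketch under stated assumptions. It is not the follow-up PR's code. The module name, the :trace_error assign key, and the tag names other than error_type are hypothetical. The idea is that the code rendering an error stores it on the conn, so the tracer callback no longer has to re-parse the response body.

```elixir
defmodule AssignsMetadataSketch do
  @moduledoc false

  # Hypothetical: called where an error response is rendered, so the error
  # details travel with the conn instead of being re-parsed from the body.
  def put_error(conn, code, description) do
    Plug.Conn.assign(conn, :trace_error, %{code: code, description: description})
  end

  # Pure helper: build span tags from the conn's assigns map.
  def tags_from_assigns(%{trace_error: %{code: code, description: description}}) do
    [error: true, error_type: code, error_msg: description]
  end

  def tags_from_assigns(_assigns), do: []

  # Callback intended for the customize_metadata option of SpandexPhoenix.
  # ASSUMPTION: the span metadata accepts a :tags keyword list.
  def customize_metadata(conn) do
    base = SpandexPhoenix.default_metadata(conn)

    case tags_from_assigns(conn.assigns) do
      [] -> base
      tags -> Keyword.put(base, :tags, tags)
    end
  end
end
```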