Closed macaptain closed 4 years ago
Thanks for the review @InoMurko.
I'll close this PR and open a new one. There's too much to fix here in one pass. I'll re-use customize_metadata, but take a different approach with assigns for adding metadata to the conn.
Overview
Because API responses are always 200, even when the response body contains an error message, observability suffers and it's hard to understand the health of services. This PR adds extra fields to the spans of requests passing through the Watcher and Watcher Info Phoenix-based APIs, which flag error responses in the Datadog trace.
While Datadog generates metrics from traces, unfortunately it isn't possible to group the traces by error_type in the metrics count (see https://docs.datadoghq.com/tracing/guide/metrics_namespace/#errors), so we will need to add more instrumentation that counts errors with custom metrics. That will be a separate PR.
Changes
What you see in Datadog traces after this change:
This is implemented by using customize_metadata on the SpandexPhoenix tracer: https://hexdocs.pm/spandex_phoenix/SpandexPhoenix.html, parsing the JSON response body with Jason (hopefully a relatively cheap operation), and adding the fields to the default metadata.
Testing
Make API calls to the endpoints, and check the local-development environment APM in Datadog. Look for services "watcher" and "watcher_info", and check the traces in the Datadog GUI.
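For reviewers who want a concrete picture of the customize_metadata approach described under Changes, here is a minimal sketch rather than the code in this PR. The module name TraceMetadataSketch, the function add_error_tags/2, the error_type tag key, and the response shape (a "success" flag with an error "code" under "data") are all assumptions made for illustration. It only shows decoding the body with Jason and merging a tag into a metadata keyword list.

```elixir
# Run as a script with `elixir sketch.exs`; Mix.install fetches Jason from Hex.
Mix.install([{:jason, "~> 1.4"}])

defmodule TraceMetadataSketch do
  # Hypothetical helper: the module name, tag key and response shape are
  # assumptions for illustration, not taken from this PR.

  # No body to inspect (for example a chunked response): leave metadata as-is.
  def add_error_tags(metadata, nil), do: metadata

  # Decodes the JSON body and, when it reports a failure, merges an
  # error_type tag into the :tags entry of the metadata keyword list.
  # Any other body (success, or not valid JSON) leaves metadata unchanged.
  def add_error_tags(metadata, body) do
    case Jason.decode(IO.iodata_to_binary(body)) do
      {:ok, %{"success" => false, "data" => %{"code" => code}}} ->
        tags = [error_type: code]
        Keyword.update(metadata, :tags, tags, &Keyword.merge(&1, tags))

      _ ->
        metadata
    end
  end
end

# Stand-in for the default metadata; the real keyword list has more entries.
default = [type: :web, tags: [service_env: "local"]]

error_body =
  ~s({"success": false, "data": {"object": "error", "code": "operation:bad_request"}})

IO.inspect(TraceMetadataSketch.add_error_tags(default, error_body))
IO.inspect(TraceMetadataSketch.add_error_tags(default, ~s({"success": true, "data": {}})))
```

In an endpoint, a function passed as the customize_metadata option would presumably start from SpandexPhoenix.default_metadata(conn) and then call a helper like this with conn.resp_body. Keeping the helper free of Plug.Conn makes it easy to unit test without a running endpoint.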
P.S.
This is my first Elixir PR. Please let me know how my code can be more idiomatic!