
Instrumenting your application with Datadog #7


Dashboard

Monitors

Creating Web request based latency monitors

We will configure our latency SLA to alert us if average request latency exceeds 1.5 seconds.

  1. Click Monitors
  2. Click New Monitor
  3. Select trace.rack.request as the metric
  4. Filter by your web service
  5. Aggregate based on avg. We only want average request latency.
  6. Evaluate the average over the last 10 minutes. This calculates the average over a 10-minute window and alerts if it exceeds the threshold.

Image

  7. Set your thresholds for alerting

Image

  8. Notify your team for this specific monitor with details (the equivalent monitor query is shown after the screenshot below)

Image
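
If you prefer working with the query directly (for example, to manage the monitor via the API or Terraform), the equivalent metric monitor query would look roughly like the following; the service tag here is a placeholder for your own web service:

    avg(last_10m):avg:trace.rack.request{service:my-web-service} > 1.5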

Creating SLA based background job monitors in Sidekiq

If your application has Service-Level Agreements, you can monitor these and alert when the contract fails. For example, if your background job's queue is matched to an SLA (e.g. within_5_minutes), you can alert if the job latency is outside this threshold. For this example, we will be utilizing Sidekiq as our background job adapter.

  1. First, you'll need to instrument latency for Sidekiq. If you have Sidekiq Enterprise, the following wiki explains configuration. Otherwise, you'll need to implement this solution yourself; this article may be helpful. A minimal sketch is included after these steps.
  2. Once you have the latency metric, we can now build a new monitor
  3. Navigate to Monitors
  4. Click create new monitor
  5. Select the metric sidekiq.sidekiq.queue.latency
  6. Filter the "from" by the queue name of queue:within_5_minutes
  7. This metric can only be aggregated by Max
  8. Because this metric is a gauge reporting seconds, we need to convert it to minutes in order to set easier alert thresholds.
  9. Click "Add Formula"
  10. Our formula takes the gauge (in seconds) and divides it by 60 to calculate minutes
  11. Because our SLA for this queue is 5 minutes, we'll evaluate the maximum over the last 5 minutes and alert if the maximum exceeds 5 minutes within that window. The resulting monitor query is shown after these steps.

Image

  12. Now we can set our alert conditions to trigger if the value is greater than 5 (minutes, in this case)

Image

  13. Lastly, give this monitor a descriptive message

Image
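
For step 1 above, here is a minimal sketch of reporting queue latency yourself. It assumes the dogstatsd-ruby gem, a Datadog agent listening on localhost:8125, and some scheduler (cron, rufus-scheduler, or a recurring Sidekiq job) that runs it periodically:

    # Minimal sketch: report each Sidekiq queue's latency (in seconds) as a gauge.
    require "datadog/statsd"
    require "sidekiq/api"

    statsd = Datadog::Statsd.new("localhost", 8125, namespace: "sidekiq")

    Sidekiq::Queue.all.each do |queue|
      # Sidekiq::Queue#latency is the age (in seconds) of the oldest job in the queue
      statsd.gauge("sidekiq.queue.latency", queue.latency, tags: ["queue:#{queue.name}"])
    end

    # dogstatsd-ruby v5 buffers metrics, so flush before the process exits
    statsd.flush(sync: true)

With the sidekiq namespace, the emitted metric name becomes sidekiq.sidekiq.queue.latency, matching the metric selected in step 5.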
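
Putting steps 5 through 11 together, the resulting monitor query (the gauge divided by 60 to convert seconds to minutes, evaluated as a max over the last 5 minutes) would look roughly like this:

    max(last_5m):max:sidekiq.sidekiq.queue.latency{queue:within_5_minutes} / 60 > 5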

Tags

Set custom tags for additional Trace data

As your application runs, Datadog tracks incoming traces for the various layers of your application. One way you can improve these is by adding tags.

Tags give you the ability to inject additional information into your traces, such as user id, endpoint name, or outgoing request params. These can be utilized for debugging as well as observability.

The following shows a very simple example of tagging the user id:

    # Guard against environments where the Datadog tracer isn't loaded
    if defined?(Datadog::Tracing)
      Datadog::Tracing.active_span&.set_tag("app_context.user.id", current_user.id)
    end
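
A common place to set this tag is a controller-level callback so every traced request gets annotated. A minimal sketch, assuming a Rails app with ddtrace and a current_user helper (e.g. from Devise):

    # Sketch: tag the active APM span with the signed-in user's id on every request.
    class ApplicationController < ActionController::Base
      before_action :tag_trace_with_user

      private

      def tag_trace_with_user
        return unless defined?(Datadog::Tracing) && current_user

        Datadog::Tracing.active_span&.set_tag("app_context.user.id", current_user.id)
      end
    end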

Once this is configured, you'll need to wait for Datadog to ingest the new traces from incoming requests. Shortly after, you'll be able to filter your traces with @app_context.user.id.

Image

When filtering, it's also important to remember the layer of the trace's span. Unless you are tagging from the entry span (e.g. the endpoint), you'll need to filter traces by "All Spans". The screenshot below illustrates this concept:

Image

Use custom tags as facets within traces

  1. Navigate to Traces
  2. Click Add facet

Image

  3. Select your tag app_context.user.id and give it a Display name
  4. Now you can filter by the new facet in your traces

Find tagged trace data

  1. Click on the trace (this will show the drawer for the resource)
  2. Make sure you are on the "Info" tab
  3. Search through Span Tags to find your new section

Image

Configure new metrics based on filters

In Datadog, you can create new metrics based on your current dataset. This is done by adding filter rules under "Custom Span Metrics" within "Generate Metrics".

  1. Hover on APM
  2. Click Setup & Configuration

Image

  3. Click the "Generate Metrics" tab

Image

  4. Add your metric name. This will be what you can search for in your metrics.
  5. Define your query. In our example, we filter by the production environment for our API service.
  6. You can also group these by your custom tags. For this example, I've grouped by the user id from our previously configured @app_context.user.id (an example configuration is shown after this list).

Image

  7. You'll now be able to filter metrics by this value to see how users interact with the API.
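
As a concrete illustration, the configuration from these steps might look like the following; the metric name, environment, and service are placeholders for your own values:

    Metric name:  api.requests.by_user
    Filter query: env:production service:api
    Group by:     @app_context.user.id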