autometrics-dev / autometrics-py

Easily add metrics to your code that actually help you spot and debug issues in production. Built on Prometheus and OpenTelemetry.
https://autometrics.dev
Apache License 2.0
metrics monitoring observability opentelemetry prometheus python telemetry


A Python port of the Rust autometrics-rs library

Metrics are a powerful and cost-efficient tool for understanding the health and performance of your code in production. But it's hard to decide what metrics to track and even harder to write queries to understand the data.

Autometrics provides a decorator that makes it trivial to instrument any function with the most useful metrics: request rate, error rate, and latency. It standardizes these metrics and then generates powerful Prometheus queries based on your function details to help you quickly identify and debug issues in production.
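To make the idea concrete, here is a minimal standard-library sketch of what a decorator of this kind records for each call: a count, an outcome, and a duration. It is an illustration only, not the library's implementation; `track_calls`, `CALL_COUNTS`, `DURATIONS`, and `error_rate` are hypothetical names, and the real decorator records to Prometheus or OpenTelemetry rather than to in-memory dictionaries.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory stores used only for this sketch.
CALL_COUNTS = defaultdict(int)   # (function name, "ok" | "error") -> number of calls
DURATIONS = defaultdict(list)    # function name -> list of call durations in seconds


def track_calls(func):
    """Record the call count, outcome, and duration of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            result = "error"
            raise  # the exception still reaches the caller
        finally:
            DURATIONS[func.__name__].append(time.perf_counter() - start)
            CALL_COUNTS[(func.__name__, result)] += 1

    return wrapper


def error_rate(function_name):
    """Fraction of recorded calls to `function_name` that raised an exception."""
    ok = CALL_COUNTS[(function_name, "ok")]
    errors = CALL_COUNTS[(function_name, "error")]
    total = ok + errors
    return errors / total if total else 0.0


@track_calls
def divide(a, b):
    return a / b
```

Calling `divide` a few times, including once with a zero divisor, fills both stores; request rate, error rate, and latency can all be derived from those two pieces of data.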

See Why Autometrics? for more details on the ideas behind autometrics.

Features

Quickstart

  1. Add autometrics to your project's dependencies:
pip install autometrics
  2. Instrument your functions with the @autometrics decorator:
from autometrics import autometrics

@autometrics
def my_function():
    ...  # your function body
  3. Configure autometrics by calling the init function:
from autometrics import init

init(tracker="prometheus", service_name="my-service")
  4. Export the metrics for Prometheus:
# This example uses FastAPI, but you can use any web framework
from fastapi import FastAPI, Response
from prometheus_client import generate_latest

app = FastAPI()

# Set up a metrics endpoint for Prometheus to scrape
#   `generate_latest` returns metrics data in the Prometheus text format
@app.get("/metrics")
def metrics():
    return Response(generate_latest())
  5. Run Prometheus locally with the Autometrics CLI, or configure it manually to scrape your metrics endpoint:
# Replace `8080` with the port that your app runs on
am start :8080
  6. (Optional) If you have Grafana, import the Autometrics dashboards for an overview and detailed view of all the function metrics you've collected.

Using autometrics-py

from autometrics import autometrics

@autometrics
def say_hello():
    return "hello"

Dashboards

Autometrics provides Grafana dashboards that will work for any project instrumented with the library.

Alerts / SLOs

Autometrics makes it easy to add intelligent alerting to your code, in order to catch increases in the error rate or latency across multiple functions.

from autometrics import autometrics
from autometrics.objectives import Objective, ObjectiveLatency, ObjectivePercentile

# Create an objective for a high success rate
# Here, we want our API to have a success rate of 99.9%
API_SLO_HIGH_SUCCESS = Objective(
    "My API SLO for High Success Rate (99.9%)",
    success_rate=ObjectivePercentile.P99_9,
)

@autometrics(objective=API_SLO_HIGH_SUCCESS)
def api_handler():
    ...  # handler body

The library uses the concept of Service-Level Objectives (SLOs) to define the acceptable error rate and latency for groups of functions. Alerts will fire depending on the SLOs you set.

Not sure what SLOs are? Check out our docs for an introduction.

In order to receive alerts, you need to add a special set of rules to your Prometheus setup. These are configured automatically when you use the Autometrics CLI to run Prometheus.

Already running Prometheus yourself? Read about how to load the autometrics alerting rules into Prometheus here.

Once the alerting rules are in Prometheus, you're ready to go.

To use autometrics SLOs and alerts, create one or more Objectives based on the success rate and/or latency of your function(s), as shown above.

The Objective can be passed as an argument to the autometrics decorator, which will include the given function in that objective.

The example above used a success rate objective. (I.e., we wanted to be alerted when the error rate started to increase.)
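As a worked illustration of what a 99.9% success-rate objective tolerates, the sketch below computes how many failed calls fit inside the objective and whether a given failure count still meets it. The helper names are hypothetical and this is plain arithmetic, not how the alerting rules are evaluated in Prometheus. Exact fractions are used so that 99.9% is not distorted by floating-point rounding.

```python
from fractions import Fraction


def allowed_failures(total_calls, objective_percent):
    """Number of failed calls that a success-rate objective tolerates.

    `objective_percent` is passed as a string such as "99.9" so it converts
    to an exact fraction.
    """
    return total_calls * (100 - Fraction(objective_percent)) / 100


def success_objective_met(total_calls, failed_calls, objective_percent):
    """True if the observed success rate is at least the objective."""
    if total_calls == 0:
        return True  # no calls, nothing violated
    success_percent = Fraction(total_calls - failed_calls, total_calls) * 100
    return success_percent >= Fraction(objective_percent)
```

For example, out of 1,000,000 calls a 99.9% objective leaves room for 1,000 failures; out of 10,000 calls, 10 failures still meet the objective and 11 do not.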

You can also create an objective for the latency of your functions like so:

from autometrics import autometrics
from autometrics.objectives import Objective, ObjectiveLatency, ObjectivePercentile

# Create an objective for low latency
#   - Functions with this objective should have a 99th percentile latency of less than 250ms
API_SLO_LOW_LATENCY = Objective(
    "My API SLO for Low Latency (99th percentile < 250ms)",
    latency=(ObjectiveLatency.Ms250, ObjectivePercentile.P99),
)

@autometrics(objective=API_SLO_LOW_LATENCY)
def api_handler():
    ...  # handler body
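One way to read the latency objective above is "at least 99% of calls finish within 250 ms". The sketch below checks that condition against a list of observed durations. The function name is hypothetical, the choice of "within" as less than or equal to the threshold is an assumption, and the real check runs in Prometheus rather than in Python.

```python
def latency_objective_met(durations_seconds, threshold_seconds, target_percent):
    """True if at least `target_percent` of calls took `threshold_seconds` or less."""
    if not durations_seconds:
        return True  # no calls, nothing violated
    fast_calls = sum(1 for d in durations_seconds if d <= threshold_seconds)
    # Integer comparison avoids floating-point rounding in the percentage.
    return fast_calls * 100 >= target_percent * len(durations_seconds)
```

With 100 calls, one slow call out of 100 still satisfies a 99% target at 250 ms, while two slow calls do not.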

The caller Label

Autometrics keeps track of instrumented functions that call each other. So, if you have a function get_users that calls another function db.query, then the metrics for the latter will include a label caller="get_users".

This allows you to drill down into the metrics for functions that are called by your instrumented functions, provided both of those functions are decorated with @autometrics.

In the example above, this means that you could investigate the latency of the database queries that get_users makes, which is rather useful.
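The caller relationship can be sketched with the standard library's `contextvars`: each decorated function reads the name of the function currently running, records it as its caller, and then registers itself for the duration of its own call. This is an illustration under assumptions, not the library's code; `track_caller`, `RECORDED_LABELS`, and the standalone `query` function (standing in for db.query) are hypothetical.

```python
import contextvars
import functools

# Name of the decorated function currently executing ("" at the top level).
_current_function = contextvars.ContextVar("current_function", default="")

# Hypothetical record of the labels each call would carry.
RECORDED_LABELS = []


def track_caller(func):
    """Record which decorated function (if any) called `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        caller = _current_function.get()
        RECORDED_LABELS.append({"function": func.__name__, "caller": caller})
        token = _current_function.set(func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            # Restore the previous value so sibling calls see the right caller.
            _current_function.reset(token)

    return wrapper


@track_caller
def query():
    return ["alice", "bob"]


@track_caller
def get_users():
    return query()
```

Calling `get_users()` records `query` with caller `get_users`; calling `query()` directly records an empty caller. Because the lookup only sees decorated functions, an undecorated function in between leaves no trace, which matches the requirement that both functions be decorated.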

Settings and Configuration

Autometrics makes use of a number of environment variables to configure its behavior. All of them are also configurable with keyword arguments to the init function.

Below is an example of initializing autometrics with build information, as well as the prometheus tracker. (Note that you can also accomplish the same configuration with environment variables.)

from autometrics import autometrics, init
from git_utils import get_git_commit, get_git_branch

VERSION = "0.0.1"

init(
  tracker="prometheus",
  version=VERSION,
  commit=get_git_commit(),
  branch=get_git_branch()
)

Identifying commits that introduced problems

Autometrics makes it easy to identify if a specific version or commit introduced errors or increased latencies.

NOTE - As of writing, build_info will not work correctly when using the default setting of AUTOMETRICS_TRACKER=opentelemetry. If you wish to use build_info, you must use the prometheus tracker instead (AUTOMETRICS_TRACKER=prometheus).

The issue will be fixed once the following PR is merged and released on the opentelemetry-python project: https://github.com/open-telemetry/opentelemetry-python/pull/3306

autometrics-py will track support for build_info using the OpenTelemetry tracker via this issue

The library uses a separate metric (build_info) to track the version and git metadata of your code - repository url, provider name, commit and branch.

It then writes queries that group metrics by these metadata, so you can spot correlations between code changes and potential issues.

Configure these labels by setting the following environment variables:

| Label | Run-Time Environment Variables | Default value |
| --- | --- | --- |
| version | AUTOMETRICS_VERSION | "" |
| commit | AUTOMETRICS_COMMIT or COMMIT_SHA | "" |
| branch | AUTOMETRICS_BRANCH or BRANCH_NAME | "" |
| repository_url | AUTOMETRICS_REPOSITORY_URL | ""* |
| repository_provider | AUTOMETRICS_REPOSITORY_PROVIDER | ""* |

* Autometrics will attempt to automagically infer these values from the git config inside your working directory. To disable this behavior, explicitly set the corresponding setting or environment variable to "".

This follows the method outlined in Exposing the software version to Prometheus.
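The environment-variable lookup for the first three labels can be sketched as follows. This is a hypothetical helper, not part of the library. It assumes that the AUTOMETRICS_-prefixed variable takes precedence when both alternatives are set, which the table does not state, and it leaves out the git-config inference for repository_url and repository_provider.

```python
import os


def build_info_labels(environ=None):
    """Read version, commit, and branch labels from environment variables.

    Assumption: the AUTOMETRICS_-prefixed variable wins over its fallback.
    """
    env = os.environ if environ is None else environ
    return {
        "version": env.get("AUTOMETRICS_VERSION", ""),
        "commit": env.get("AUTOMETRICS_COMMIT") or env.get("COMMIT_SHA", ""),
        "branch": env.get("AUTOMETRICS_BRANCH") or env.get("BRANCH_NAME", ""),
    }
```

Passing a plain dictionary instead of the process environment makes the lookup easy to try out, for example `build_info_labels({"COMMIT_SHA": "abc123"})`.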

Service name

All metrics produced by Autometrics have a label called service.name (or service_name when exported to Prometheus) attached, in order to identify the logical service they are part of.

You may want to override the default service name, for example if you are running multiple instances of the same code base as separate services, and you want to differentiate between the metrics produced by each one.

The service name is loaded from the following sources, in this order:

  1. AUTOMETRICS_SERVICE_NAME (at runtime)
  2. OTEL_SERVICE_NAME (at runtime)
  3. First part of __package__ (at runtime)
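The lookup order above can be sketched as a small function. `resolve_service_name` is a hypothetical name used for illustration; the library performs this resolution internally, and treating an empty variable as unset is an assumption of this sketch.

```python
import os


def resolve_service_name(package, environ=None):
    """Return the service name using the documented lookup order.

    `package` stands in for the module's __package__ value, e.g. "shop.api".
    """
    env = os.environ if environ is None else environ
    return (
        env.get("AUTOMETRICS_SERVICE_NAME")
        or env.get("OTEL_SERVICE_NAME")
        or (package or "").split(".")[0]  # first part of the package name
    )
```

With no variables set, `resolve_service_name("shop.api", {})` falls back to the first part of the package name, `shop`.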

Exemplars

NOTE - As of writing, exemplars aren't supported by the default tracker (AUTOMETRICS_TRACKER=opentelemetry). You can track the progress of this feature here: https://github.com/autometrics-dev/autometrics-py/issues/41

Exemplars are a way to associate a metric sample with a trace by attaching a trace_id and span_id to it. You can then use this information to jump from a metric to a trace in your tracing system (for example Jaeger). If you have an OpenTelemetry tracer configured, autometrics will automatically pick up the current span from it.

To use exemplars, you need to first switch to a tracker that supports them by setting AUTOMETRICS_TRACKER=prometheus and enable exemplar collection by setting AUTOMETRICS_EXEMPLARS=true. You also need to enable exemplars in Prometheus by launching Prometheus with the --enable-feature=exemplar-storage flag.
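Conceptually, an exemplar is a small set of extra labels stored next to one metric sample. The sketch below models that with a dataclass; the `Sample` type, the `observe` helper, the metric name, and the dictionary-shaped span are all hypothetical stand-ins and do not reflect the library's or Prometheus's internal representation.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One recorded observation, optionally linked to a trace."""
    name: str
    value: float
    labels: dict = field(default_factory=dict)
    exemplar: dict = field(default_factory=dict)


def observe(name, value, labels, span=None):
    """Build a sample; attach trace_id and span_id when a span is active."""
    exemplar = {}
    if span is not None:
        exemplar = {"trace_id": span["trace_id"], "span_id": span["span_id"]}
    return Sample(name, value, dict(labels), exemplar)


# Hypothetical usage: a latency observation made while a span is active.
sample = observe(
    "request_duration_seconds",
    0.12,
    {"function": "get_users"},
    span={"trace_id": "4bf92f35", "span_id": "00f067aa"},
)
```

A dashboard that understands exemplars can use the attached `trace_id` to link the slow observation to the trace that produced it.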

Exporting metrics

There are multiple ways to export metrics from your application, depending on your setup. You can see examples of how to do this in the examples/export_metrics directory.

If you want to export metrics to Prometheus, you have two options, both of which work with the opentelemetry and prometheus trackers:

  1. Create a route inside your app and respond with generate_latest():
# This example uses FastAPI, but you can use any web framework
from fastapi import FastAPI, Response
from prometheus_client import generate_latest

app = FastAPI()

# Set up a metrics endpoint for Prometheus to scrape
@app.get("/metrics")
def metrics():
    return Response(generate_latest())
  2. Specify prometheus as the exporter type, and a separate server will be started to expose metrics from your app:
from autometrics import init

exporter = {
    "type": "prometheus",
    "address": "localhost",
    "port": 9464
}
init(tracker="prometheus", service_name="my-service", exporter=exporter)

For the OpenTelemetry tracker, you have more options, including a custom metric reader. You can specify the exporter type to be otlp-proto-http or otlp-proto-grpc, and metrics will be exported to a remote OpenTelemetry collector via the specified protocol. You will need to install the respective extra dependency in order for this to work, which you can do when you install autometrics:

pip install autometrics[exporter-otlp-proto-http]
pip install autometrics[exporter-otlp-proto-grpc]

After installing it you can configure the exporter as follows:

exporter = {
    "type": "otlp-proto-grpc",
    "address": "http://localhost:4317",
    "insecure": True
}
init(tracker="opentelemetry", service_name="my-service", exporter=exporter)

To use a custom metric reader you can specify the exporter type to be otel-custom and provide a custom metric reader:

from autometrics import init
from opentelemetry.exporter.prometheus import PrometheusMetricReader

my_custom_metric_reader = PrometheusMetricReader("")
exporter = {
    "type": "otel-custom",
    "reader": my_custom_metric_reader
}
init(tracker="opentelemetry", service_name="my-service", exporter=exporter)

Development of the package

This package uses poetry as a package manager, with all dependencies separated into three groups:

By default, poetry will only install required dependencies. If you want to run the examples, install them using this command:

poetry install --with examples

Code in this repository is:

To run these tools locally, you first have to install them. You can do so using poetry:

poetry install --with dev --all-extras

After that, you can run the tools individually:

# Formatting using black
poetry run black .
# Lint using mypy
poetry run mypy .
# Run the tests using pytest
poetry run pytest
# Run a single test, and clear the cache
poetry run pytest --cache-clear -k test_tracker