py-grpc-prometheus

Instrumentation library that provides Prometheus metrics for gRPC, similar to the Java and Go gRPC Prometheus libraries.

Status

The library currently has metric parity with the Java and Go libraries.

Server side:

grpc_server_started_total
grpc_server_handled_total
grpc_server_msg_received_total
grpc_server_msg_sent_total
grpc_server_handling_seconds

Client side:

grpc_client_started_total
grpc_client_handled_total
grpc_client_msg_received_total
grpc_client_msg_sent_total
grpc_client_handling_seconds
grpc_client_msg_recv_handling_seconds
grpc_client_msg_send_handling_seconds

How to use

pip install py-grpc-prometheus

Client side:

Client metrics are collected by intercepting the gRPC channel.

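import grpc
from prometheus_client import start_http_server
from py_grpc_prometheus.prometheus_client_interceptor import PromClientInterceptor

# Wrap the channel so every call made through it is measured.
channel = grpc.intercept_channel(grpc.insecure_channel('server:6565'),
                                 PromClientInterceptor())
# Start an endpoint to expose metrics; metrics_port is a port of your choice.
start_http_server(metrics_port)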

Server side:

Server metrics are exposed by adding the interceptor when the gRPC server is started. For a complete client example, see tests/integration/hello_world/hello_world_client.py.

import grpc
from concurrent import futures
from py_grpc_prometheus.prometheus_server_interceptor import PromServerInterceptor
from prometheus_client import start_http_server

Start the gRPC server with the interceptor. See tests/integration/hello_world/hello_world_server.py for the complete example.

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10),
                     interceptors=(PromServerInterceptor(),))
# Start an endpoint to expose metrics; metrics_port is a port of your choice.
start_http_server(metrics_port)
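
Once start_http_server is running, the metrics are served in the Prometheus text exposition format. The sketch below is one way to check that the gRPC series show up. It parses a sample payload rather than a live endpoint. The helper name grpc_metric_names, the label names, and the values in SAMPLE are assumptions for illustration, not output captured from this library.

```python
def grpc_metric_names(exposition_text, prefix="grpc_"):
    """Return the sorted, de-duplicated metric names that start with `prefix`."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends where the label set "{" or the value begins.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


# Hypothetical payload; real output depends on your services and traffic.
SAMPLE = """\
# HELP grpc_server_started_total Total number of RPCs started on the server.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="UNARY"} 3.0
grpc_server_handled_total{grpc_code="OK",grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="UNARY"} 3.0
python_info{implementation="CPython"} 1.0
"""

print(grpc_metric_names(SAMPLE))
```

To run the same check against a live process, the text could be read with urllib.request.urlopen from the host and port passed to start_http_server.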

Histograms

Prometheus histograms are a great way to measure the latency distributions of your RPCs. However, because high-cardinality metrics are bad practice, the latency metrics are disabled by default. To enable them, pass the relevant flag when you create the interceptor:

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10),
                     interceptors=(PromServerInterceptor(enable_handling_time_histogram=True),))

After the call completes, its handling time will be recorded in a Prometheus histogram variable grpc_server_handling_seconds. The histogram variable contains three sub-metrics:

grpc_server_handling_seconds_count - the count of all completed RPCs by status and method
grpc_server_handling_seconds_sum - cumulative time of RPCs by status and method, useful for calculating average handling times
grpc_server_handling_seconds_bucket - contains the counts of RPCs by status and method in respective handling-time buckets. These buckets can be used by Prometheus to estimate SLAs.
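
The three sub-metrics can be combined as follows. This is a minimal sketch with made-up numbers: the average is _sum divided by _count, and a quantile is estimated by linear interpolation inside the bucket that contains the target rank, which is the same idea Prometheus's histogram_quantile function uses. The function names and sample values are hypothetical.

```python
import math


def average_seconds(total_sum, total_count):
    """Average handling time from the _sum and _count sub-metrics."""
    return total_sum / total_count if total_count else float("nan")


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # +Inf bucket or empty bucket: nothing to interpolate over.
                return lower_bound
            # Interpolate linearly within the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical samples for one method: 100 RPCs taking 25 seconds in total.
BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(average_seconds(25.0, 100))
print(estimate_quantile(0.95, BUCKETS))
```

The estimate is only as fine-grained as the bucket boundaries, so a quantile that falls in a wide bucket carries a correspondingly wide error.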

Server Side:

enable_handling_time_histogram: Enables 'grpc_server_handling_seconds'

Client Side:

enable_client_handling_time_histogram: Enables 'grpc_client_handling_seconds'
enable_client_stream_receive_time_histogram: Enables 'grpc_client_msg_recv_handling_seconds'
enable_client_stream_send_time_histogram: Enables 'grpc_client_msg_send_handling_seconds'
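
To make the cardinality concern concrete, the sketch below estimates how many time series one histogram adds. It assumes the histogram uses the prometheus_client default of 14 finite buckets plus +Inf, and that each method and status combination gets its own set of series. The method and status counts are made-up inputs, and the bucket layout this library actually uses may differ.

```python
# Assumption: prometheus_client's default histogram has 14 finite buckets plus +Inf.
DEFAULT_BUCKET_COUNT = 15


def histogram_series(methods, statuses, bucket_count=DEFAULT_BUCKET_COUNT):
    """Estimate the time series created by one histogram metric."""
    # Each label combination exports one series per bucket, plus _sum and _count.
    series_per_combination = bucket_count + 2
    return methods * statuses * series_per_combination


def counter_series(methods, statuses):
    """A counter exports a single series per label combination."""
    return methods * statuses


# Hypothetical service: 20 RPC methods, each observed with 5 distinct status codes.
print(counter_series(methods=20, statuses=5))
print(histogram_series(methods=20, statuses=5))
```

Under these assumptions a histogram costs 17 times as many series as a counter with the same labels, which is why each histogram has its own opt-in flag.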

Legacy metrics:

Metric names have been updated to be in line with those from https://github.com/grpc-ecosystem/go-grpc-prometheus.

The legacy metrics are:

server side:

grpc_server_started_total
grpc_server_handled_total
grpc_server_handled_latency_seconds
grpc_server_msg_received_total
grpc_server_msg_sent_total

client side:

grpc_client_started_total
grpc_client_completed
grpc_client_completed_latency_seconds
grpc_client_msg_sent_total
grpc_client_msg_received_total

To keep using these legacy metrics for backwards compatibility, set the legacy flag to True when initialising the server or client interceptor.

For example, to enable the server side legacy metrics:

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10),
                     interceptors=(PromServerInterceptor(legacy=True),))
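
When moving dashboards or alerts off the legacy names, a small rename table can help. The mapping below is an assumption inferred by comparing the two metric lists in this README, pairing by meaning the names that appear in only one list. It is not taken from the library's source, so verify it against your own metrics output. The helper name migrate_query and the grpc_code label in the usage line are hypothetical.

```python
import re

# Assumed legacy -> current pairs; names present in both lists are unchanged.
LEGACY_TO_CURRENT = {
    "grpc_server_handled_latency_seconds": "grpc_server_handling_seconds",
    "grpc_client_completed_latency_seconds": "grpc_client_handling_seconds",
    "grpc_client_completed": "grpc_client_handled_total",
}

# Histogram sub-metrics carry one of these suffixes after the base name.
_SUFFIXES = ("", "_bucket", "_sum", "_count")
_TOKEN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*")


def _rename(match):
    token = match.group(0)
    for suffix in _SUFFIXES:
        base = token[: len(token) - len(suffix)] if suffix else token
        if token.endswith(suffix) and base in LEGACY_TO_CURRENT:
            return LEGACY_TO_CURRENT[base] + suffix
    return token


def migrate_query(query):
    """Rewrite legacy metric names in a PromQL-style query string."""
    return _TOKEN.sub(_rename, query)


print(migrate_query('rate(grpc_client_completed{grpc_code="OK"}[5m])'))
```

Matching whole identifier tokens, rather than substrings, keeps grpc_client_completed from being rewritten inside grpc_client_completed_latency_seconds.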

How to run and test

make initialize-development
make test

TODO:

Unit test with https://github.com/census-instrumentation/opencensus-python/blob/master/tests/unit/trace/ext/grpc/test_server_interceptor.py
