Servant-prometheus allows you to record metrics about your servant applications on a per endpoint basis. It uses the information contained in the API's type to produce counters for all endpoints, and adds very little overhead (the included benchmarks show the benchmarked app can sustain 40k req/sec with monitoring, and 41k without, when quantiles are not measured).
In the example below, runtime system metrics are also reported using the prometheus-metrics-ghc package. If you use GHC metrics, make sure that your app is run with +RTS -T so that it has access to the runtime stats.
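One way to catch a missing -T early is to check at startup whether the RTS is collecting statistics. The following is a minimal sketch, not part of servant-prometheus: it uses getRTSStatsEnabled from GHC.Stats in base, and the helper name warnIfNoRtsStats is made up for illustration.

```haskell
import Control.Monad (unless)
import GHC.Stats (getRTSStatsEnabled)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical helper: print a warning when the program was started
-- without +RTS -T, in which case the RTS collects no statistics.
warnIfNoRtsStats :: IO ()
warnIfNoRtsStats = do
  enabled <- getRTSStatsEnabled
  unless enabled $
    hPutStrLn stderr
      "warning: RTS stats are disabled; run with +RTS -T to report GHC metrics"

main :: IO ()
main = warnIfNoRtsStats
```

Depending on how the executable is linked, passing +RTS -T on the command line may require building with -rtsopts; alternatively, the GHC option -with-rtsopts=-T bakes the flag into the binary.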
The library is based on the servant-ekg package, but differs in that it preallocates all meters at app launch, which avoids contention between all endpoints waiting on a single MVar. Due to the design of prometheus-client, there is still contention between threads responding to the same endpoint, but the overhead is minimal except in the case of quantiles (Summaries in prometheus-client terminology); see the note below.
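The preallocation idea can be illustrated with a small standalone sketch. This is not the library's implementation: the Registry type, the preallocate and hit functions, and the endpoint names are all hypothetical, and a plain IORef counter stands in for a real meter. The point is that once the map is built it is never modified, so looking up an endpoint needs no lock, and only requests to the same endpoint touch the same mutable cell. It assumes the containers package is available.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as M

-- Hypothetical stand-in for per-endpoint meters: one counter per endpoint.
type Registry = M.Map String (IORef Int)

-- Allocate a counter for every known endpoint up front. The map itself
-- is immutable afterwards, so lookups are pure and need no shared lock.
preallocate :: [String] -> IO Registry
preallocate names = M.fromList <$> mapM (\n -> (,) n <$> newIORef 0) names

-- Increment the counter for one endpoint. Only requests hitting the
-- same endpoint contend on the same IORef.
hit :: Registry -> String -> IO ()
hit reg name = case M.lookup name reg of
  Nothing  -> pure () -- unknown endpoint: ignored in this sketch
  Just ref -> atomicModifyIORef' ref (\n -> (n + 1, ()))

main :: IO ()
main = do
  reg <- preallocate ["users.GET", "users.POST"]
  mapM_ (hit reg) ["users.GET", "users.GET", "users.POST"]
  counts <- traverse readIORef reg
  print (M.toList counts) -- prints [("users.GET",2),("users.POST",1)]
```

By contrast, a design that keeps the whole map inside a single MVar and inserts meters lazily makes every request, for any endpoint, take the same lock.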
import Control.Concurrent (forkIO)
import Data.Proxy (Proxy (..))
import Servant.Prometheus
import Servant (Server, serve)
import Network.Wai.Handler.Warp (run)
import Prometheus.Metric.GHC (ghcMetrics)
import Prometheus (register)
...
appApi :: Proxy AppAPI
appApi = Proxy
app :: Server AppAPI
app = ...
main = do
  register ghcMetrics
  -- Fork a separate server for serving nothing but metrics,
  -- which you will point Prometheus at.
  forkIO $ run monitoringPort servePrometheusMetrics
  -- Allocate the counters necessary for all app endpoints.
  meters <- makeMeters appApi NoQuantiles
  -- Run your app with metric monitoring.
  run port $ monitorServant appApi meters $ serve appApi app
The benchmarks for this library require the wrk HTTP benchmarking tool to be installed.
It is possible to estimate response time metrics per endpoint (50%, 95% and 99% latency for requests), but there is significant overhead when the response time for your endpoints is low. You should benchmark your application with and without quantiles enabled to decide whether they are worth it for your application. The output below shows the impact that quantiles can have on very cheap endpoints:
$ bench +RTS -N
Benchmarking servant-prometheus (no quantiles)
Running 20s test @ http://localhost:52649
2 threads and 30 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 658.38us 1.05ms 36.02ms 96.64%
Req/Sec 20.57k 3.57k 28.41k 76.37%
Latency Distribution
50% 494.00us
75% 760.00us
90% 1.11ms
99% 2.95ms
822628 requests in 20.10s, 134.73MB read
Socket errors: connect 0, read 56, write 2, timeout 0
Requests/sec: 40923.58
Transfer/sec: 6.70MB
Benchmarking servant-prometheus (with quantiles)
Running 20s test @ http://localhost:52743
2 threads and 30 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 6.84ms 6.96ms 113.14ms 87.25%
Req/Sec 2.66k 266.29 4.69k 75.50%
Latency Distribution
50% 5.05ms
75% 9.57ms
90% 15.35ms
99% 31.31ms
105890 requests in 20.02s, 17.25MB read
Socket errors: connect 0, read 49, write 15, timeout 0
Requests/sec: 5290.48
Transfer/sec: 0.86MB
Benchmarking without servant-prometheus
Running 20s test @ http://localhost:52841
2 threads and 30 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 690.92us 593.77us 23.68ms 93.29%
Req/Sec 20.77k 2.45k 26.30k 78.50%
Latency Distribution
50% 570.00us
75% 794.00us
90% 1.11ms
99% 2.73ms
826638 requests in 20.00s, 135.38MB read
Socket errors: connect 0, read 45, write 12, timeout 0
Requests/sec: 41322.76
Transfer/sec: 6.77MB
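To put the three runs side by side, the Requests/sec figures above can be turned into a relative throughput loss against the unmonitored baseline. The snippet below is only a worked calculation on the numbers copied from the wrk output; the function name throughputLoss is made up for illustration.

```haskell
-- Requests/sec figures copied from the wrk output above.
baseline, noQuantiles, withQuantiles :: Double
baseline      = 41322.76 -- without servant-prometheus
noQuantiles   = 40923.58 -- servant-prometheus, no quantiles
withQuantiles = 5290.48  -- servant-prometheus, with quantiles

-- Percentage of the baseline throughput that is lost.
throughputLoss :: Double -> Double -> Double
throughputLoss base measured = 100 * (1 - measured / base)

main :: IO ()
main = do
  putStrLn ("no quantiles:   " ++ show (throughputLoss baseline noQuantiles) ++ "% loss")
  putStrLn ("with quantiles: " ++ show (throughputLoss baseline withQuantiles) ++ "% loss")
```

For this benchmark, monitoring without quantiles costs roughly 1% of throughput, while enabling quantiles costs roughly 87%. These figures apply to a very cheap endpoint; the relative cost should shrink as the endpoint itself does more work.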
It is possible to decide on an endpoint-by-endpoint basis whether quantiles are enabled: run makeMeters with NoQuantiles, then for each endpoint you wish to measure, set its metersRecordQuants field to WithQuantiles and register the quantiles metric with Prometheus:
main = do
  meters <- makeMeters appApi NoQuantiles
  -- Enable quantiles for one endpoint, failing if the endpoint is unknown.
  let enableQuantile :: HashMap Text Meters -> Text -> IO (HashMap Text Meters)
      enableQuantile mp ep = case H.lookup ep mp of
        Nothing -> fail $ "Unknown endpoint: " ++ show ep
        Just m -> do
          register (metersTimeQant m)
          pure (H.insert ep m{metersRecordQuants=WithQuantiles} mp)
  meters' <- foldM enableQuantile meters
    ["servant.path.expensive.:ID.GET"
    ,"servant.path.veryexpensive.:ID.POST"]
  ...