
feat: adding custom metrics from API service #2792

Closed ssheng closed 2 years ago

ssheng commented 2 years ago

There is increasing demand from the community for adding custom metrics to the API service. BentoML supports basic service-level metrics out of the box, including request duration, in-progress requests, and request count, via prometheus_client. However, it is not as straightforward for users to create new metrics. An ideal API would allow users to define metrics during initialization and update them upon certain events (e.g. request, response, error).
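For illustration, this is the kind of workflow users are asking for, sketched with plain prometheus_client (the metric names and handler below are hypothetical):

from prometheus_client import Counter, Histogram

# Defined once, during initialization.
error_counter = Counter("api_errors_total", "Total API errors", ["endpoint"])
latency = Histogram("api_latency_seconds", "Request latency in seconds")

# Updated upon events, e.g. per request or per error.
def handle_request(endpoint):
    with latency.time():  # observes elapsed time on exit
        try:
            ...  # actual handler logic
        except Exception:
            error_counter.labels(endpoint=endpoint).inc()
            raise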

aarnphm commented 2 years ago

Context: we currently handle Prometheus metrics via PrometheusClient, which is accessed through BentoMLContainer.metrics_client as a singleton factory. This container is currently internal, and we intend to keep it for internal use only.
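For context, the internal access pattern looks roughly like this (a sketch; the module path is BentoML-internal and may differ between versions):

from bentoml._internal.configuration.containers import BentoMLContainer

# simple_di provider; returns the singleton PrometheusClient instance
metrics_client = BentoMLContainer.metrics_client.get()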

bentoml.metrics API

Hence, an external API should expose this metrics client to the user. I propose we make bentoml.metrics our user-facing API.

This API is essentially our metrics_client and should expose the full prometheus_client API. For example:

import bentoml

# Same constructor signatures as prometheus_client's Histogram and Counter.
my_histogram = bentoml.metrics.Histogram("...", ...)
my_counter = bentoml.metrics.Counter("...", ...)

This means that using bentoml.metrics should feel as seamless as importing prometheus_client directly; hence, the following should also be possible:

from bentoml.metrics import Histogram
from bentoml.metrics.parser import text_string_to_metric_families
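For reference, this mirrors the existing prometheus_client parser, whose usage looks like the following (the exposition string is a made-up example):

from prometheus_client.parser import text_string_to_metric_families

# Parse Prometheus text exposition format into metric families.
exposition = "my_counter_total 4.0\n"
for family in text_string_to_metric_families(exposition):
    for sample in family.samples:
        print(sample.name, sample.labels, sample.value)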

Register custom metrics with a given Service

Currently, if users import prometheus_client directly in their service, it fails, since BentoML has to handle Prometheus multiprocess mode to export metrics from multiple worker processes. This is done by delaying metrics initialization steps as late as possible.
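For background, plain prometheus_client multiprocess mode requires setup before the library is even imported, which is why eager metric creation breaks. A minimal sketch (the directory path is hypothetical):

import os

# prometheus_client picks its multiprocess backend from this env var at
# import time, so it must be set (and the directory must exist) first.
os.environ["PROMETHEUS_MULTIPROC_DIR"] = "/tmp/prometheus_multiproc"
os.makedirs("/tmp/prometheus_multiproc", exist_ok=True)

from prometheus_client import CollectorRegistry, Counter, multiprocess

registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)  # aggregates samples across workers

requests_total = Counter("requests_total", "Total requests")  # safe to create now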

I'm proposing that the API for registering custom metrics be analogous to how we register runners.


import bentoml

runner = bentoml.pytorch.get("my_torch_model:latest").to_runner()

# buckets follows prometheus_client's Histogram keyword argument.
my_histogram = bentoml.metrics.Histogram(name="inference_duration_seconds", buckets=my_buckets)

svc = bentoml.Service("service", runners=[runner], metrics=[my_histogram])

@svc.api(input=..., output=...)
def predict(input):
    my_histogram.labels(...).observe(...)
    # my_logic_here

This means that metrics should also be initialized as late as possible.
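One way such delayed initialization could look, sketched as a proxy object (illustrative only; none of these names are BentoML internals):

class _LazyHistogram:
    # Illustrative sketch: defer creating the real prometheus_client.Histogram
    # until first use, after the server has configured multiprocess mode.

    def __init__(self, **kwargs):
        self._kwargs = kwargs
        self._metric = None

    def _ensure(self):
        if self._metric is None:
            # Imported lazily too: prometheus_client picks its multiprocess
            # value backend from the environment at import time.
            import prometheus_client

            self._metric = prometheus_client.Histogram(**self._kwargs)
        return self._metric

    def labels(self, *args, **kwargs):
        return self._ensure().labels(*args, **kwargs)

    def observe(self, value):
        self._ensure().observe(value)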