triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

feat: Metrics Support in `tritonfrontend` #7703

Closed: KrishnanPrash closed this 3 weeks ago

KrishnanPrash commented 1 month ago

What does the PR do?

Adding support for Metrics in `tritonfrontend`. This involves two components:

With this PR, similar to `KServeHttp` and `KServeGrpc`, the metrics service can be used with:

import tritonserver
from tritonfrontend import Metrics

# Start the in-process server, then attach and start the metrics frontend.
server = tritonserver.Server(model_repository=...).start(wait_until_ready=True)
metrics_service = Metrics(server)
metrics_service.start()
...
metrics_service.stop()
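Once the metrics frontend is running, it serves Prometheus text-format metrics over HTTP and can be scraped with any HTTP client. A minimal sketch, assuming the service listens on Triton's default metrics port 8002 (the URL and port here are illustrative, not taken from this PR):

import requests

# Assumed default metrics endpoint; the address/port are configurable
# through the frontend's options.
response = requests.get("http://localhost:8002/metrics")
response.raise_for_status()

# Triton metrics are exposed in Prometheus text format (counters/gauges
# prefixed with "nv_", e.g. nv_inference_request_success).
print(response.text)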

Additional changes made in this PR:

Test plan:

Added 3 test functions to L0_python_api:
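For illustration, a test exercising the new frontend could start the metrics service and assert that the endpoint responds. This is a hedged sketch; the function name, model repository path, and port below are hypothetical rather than the actual tests added in this PR:

import requests
import tritonserver
from tritonfrontend import Metrics

def test_metrics_service_serves_prometheus_text():
    # Hypothetical test: start the in-process server, attach the metrics
    # frontend, and check that the endpoint returns Prometheus-format text.
    server = tritonserver.Server(model_repository="/path/to/models").start(
        wait_until_ready=True
    )
    metrics_service = Metrics(server)
    metrics_service.start()
    try:
        response = requests.get("http://localhost:8002/metrics")  # assumed default port
        assert response.status_code == 200
        assert "nv_" in response.text  # Triton metric names are prefixed with nv_
    finally:
        metrics_service.stop()
        server.stop()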