triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

feat: Metrics Support in `tritonfrontend` #7703

Closed: KrishnanPrash closed this 3 weeks ago

KrishnanPrash commented 1 month ago

What does the PR do?

Adding support for Metrics in `tritonfrontend`. This involves two components:

With this PR, similar to `KServeHttp` and `KServeGrpc`, the metrics service can be used with:

import tritonserver
from tritonfrontend import Metrics

# Start the in-process server, then attach and start the metrics frontend.
server = tritonserver.Server(model_repository=...).start(wait_until_ready=True)
metrics_service = Metrics(server)
metrics_service.start()
...
metrics_service.stop()
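Once the metrics frontend is running, it serves Prometheus text-format metrics over HTTP and can be scraped with any HTTP client. A minimal sketch, assuming the service listens on Triton's default metrics port 8002 (the URL and port here are illustrative, not taken from this PR):

import requests

# Assumed default metrics endpoint; the address/port are configurable
# through the frontend's options.
response = requests.get("http://localhost:8002/metrics")
response.raise_for_status()

# Triton metrics are exposed in Prometheus text format (counters/gauges
# prefixed with "nv_", e.g. nv_inference_request_success).
print(response.text)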

Additional changes made in this PR:

Test plan:

Added 3 test functions to L0_python_api:
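For illustration, a test exercising the new frontend could start the metrics service and assert that the endpoint responds. This is a hedged sketch; the function name, model repository path, and port below are hypothetical rather than the actual tests added in this PR:

import requests
import tritonserver
from tritonfrontend import Metrics

def test_metrics_service_serves_prometheus_text():
    # Hypothetical test: start the in-process server, attach the metrics
    # frontend, and check that the endpoint returns Prometheus-format text.
    server = tritonserver.Server(model_repository="/path/to/models").start(
        wait_until_ready=True
    )
    metrics_service = Metrics(server)
    metrics_service.start()
    try:
        response = requests.get("http://localhost:8002/metrics")  # assumed default port
        assert response.status_code == 200
        assert "nv_" in response.text  # Triton metric names are prefixed with nv_
    finally:
        metrics_service.stop()
        server.stop()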