triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

feat: KServe Bindings to start tritonfrontend #7662

Closed. KrishnanPrash closed this 2 months ago.

KrishnanPrash commented 2 months ago

Adds support for starting the KServeHttp and KServeGrpc frontends from server/python/openai/openai_frontend/main.py. A minimal sketch of what starting these frontends through the tritonfrontend bindings might look like is below; the class and option names follow the in-process tritonserver/tritonfrontend Python APIs as I understand them, and the model repository path is a placeholder.
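```python
# Sketch: start the in-process Triton core, then attach the KServe frontends.
# API names (Server, KServeHttp, KServeGrpc, Options) are assumptions based on
# the tritonserver/tritonfrontend Python bindings; verify against the repo.
import time

import tritonserver
from tritonfrontend import KServeHttp, KServeGrpc

# Start the in-process Triton core ("/models" is a placeholder path).
server = tritonserver.Server(model_repository="/models").start(wait_until_ready=True)

# Bind the KServe HTTP and gRPC frontends to the running core.
http_service = KServeHttp(server, KServeHttp.Options(port=8000))
grpc_service = KServeGrpc(server, KServeGrpc.Options(port=8001))
http_service.start()
grpc_service.start()

try:
    while True:
        time.sleep(1)  # serve until interrupted
except KeyboardInterrupt:
    http_service.stop()
    grpc_service.stop()
    server.stop()
```

Keeping the KServe frontends on their usual ports (8000/8001) leaves the OpenAI frontend free to listen on its own port alongside them.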

rmccorm4 commented 2 months ago

LGTM - but can you update README.md to replace port 8000 with port 9000 in the OpenAI call examples?
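For reference, an updated call example against port 9000 might look like the following sketch using the openai Python client; the model name is a placeholder, and the dummy api_key assumes the frontend does not require authentication.

```python
from openai import OpenAI

# Point the client at the OpenAI-compatible frontend on port 9000.
client = OpenAI(base_url="http://localhost:9000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```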