triton-inference-server / openvino_backend

OpenVINO backend for Triton.
BSD 3-Clause "New" or "Revised" License

OpenVINO integration improvements #60

Closed: dtrawins closed this 1 year ago

dtrawins commented 1 year ago

Replaced the synchronous inference call with an asynchronous one, which boosts throughput when multiple requests run concurrently

Added options to configure more parameters: performance_hint and a numeric value for NUM_STREAMS. This makes it possible to tune performance to the load

Added documentation and an example of how to configure Triton for low- and high-concurrency loads

Added and documented all OpenVINO frontends; previously only the IR format was supported
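
To illustrate the tuning options above, here is a minimal sketch that renders a `parameters` block for a model's `config.pbtxt`. It assumes the backend reads the keys as upper-case `PERFORMANCE_HINT` and `NUM_STREAMS` with string values, and that `LATENCY` and `THROUGHPUT` are the accepted hint values. The helper `render_parameters`, the two profiles, and the stream count of 4 are hypothetical and chosen only for illustration; check the backend documentation for the exact keys and values.

```python
from typing import Mapping


def render_parameters(params: Mapping[str, str]) -> str:
    """Render key/value pairs as a config.pbtxt `parameters` block (hypothetical helper)."""
    entries = []
    for key, value in params.items():
        entries.append(
            "{\n"
            f'  key: "{key}"\n'
            "  value: {\n"
            f'    string_value: "{value}"\n'
            "  }\n"
            "}"
        )
    return "parameters: [\n" + ",\n".join(entries) + "\n]"


# Assumed profile for low concurrency: favor per-request latency.
LOW_CONCURRENCY = {"PERFORMANCE_HINT": "LATENCY"}

# Assumed profile for high concurrency: favor throughput and set an
# explicit number of streams. The value 4 is illustrative, not a recommendation.
HIGH_CONCURRENCY = {"PERFORMANCE_HINT": "THROUGHPUT", "NUM_STREAMS": "4"}

if __name__ == "__main__":
    print(render_parameters(LOW_CONCURRENCY))
    print(render_parameters(HIGH_CONCURRENCY))
```

The printed block can be pasted into a model's `config.pbtxt`; the right stream count depends on the hardware and the expected number of concurrent requests, so it is worth measuring rather than guessing.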

tanmayv25 commented 1 year ago

Thanks for the contribution and the explanation in response to the question! Can you fill out the CLA and email it to us?

tanmayv25 commented 1 year ago

Received the CLA. Thanks for your contribution!