triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Yolo in the Cloud #4907

Closed: alexkehoe closed this issue 1 year ago

alexkehoe commented 2 years ago

Hi, we have a YOLO model that we've converted to TensorRT for inference on the edge. We'd like to port this model to the cloud so we can process 100+ streams. Can Triton be used to run such a model in the cloud? I.e., we would send an RTSP or WebRTC video stream to Triton and then send just the bounding box results back to our local server for display to the user. The frame rate would be around 10 Hz.

If Triton is not appropriate for this use case, can you recommend another way to scale deployment of a YOLO model across 100+ video streams?

jbkyang-nvi commented 2 years ago

Hi, please read the quickstart guide for details on Triton Server.

I believe there are users who have run YOLO models with Triton. The input data needs to be sent from Triton clients. Please read up on the Triton Architecture.

https://github.com/triton-inference-server/client#triton-client-libraries-and-examples has some more examples you can refer to
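Since the linked client examples work on image files, one way to adapt them to a live stream is to decode the stream on the client side and submit individual frames through the regular Triton client API, so only the small detection tensors travel back over the network. Below is a minimal sketch of that approach, assuming a gRPC endpoint on localhost:8001 and hypothetical names for the model (yolo_trt) and its input/output tensors (images, detections); these would need to match your model's config.pbtxt. It uses OpenCV for RTSP decoding and the tritonclient Python package for inference.

```python
# Minimal sketch: per-frame inference on an RTSP stream via the Triton gRPC client.
# Model name, tensor names, and input shape are assumptions -- check config.pbtxt.
import cv2
import numpy as np
import tritonclient.grpc as grpcclient

TRITON_URL = "localhost:8001"                       # Triton gRPC endpoint
MODEL_NAME = "yolo_trt"                             # hypothetical model name
INPUT_NAME, OUTPUT_NAME = "images", "detections"    # hypothetical tensor names
INPUT_W, INPUT_H = 640, 640                         # assumed network input size

client = grpcclient.InferenceServerClient(url=TRITON_URL)

cap = cv2.VideoCapture("rtsp://camera.local/stream")  # any RTSP source
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # Preprocess: resize, BGR -> RGB, scale to [0, 1], HWC -> NCHW with batch dim.
    img = cv2.resize(frame, (INPUT_W, INPUT_H))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    img = np.transpose(img, (2, 0, 1))[np.newaxis, ...]

    inp = grpcclient.InferInput(INPUT_NAME, list(img.shape), "FP32")
    inp.set_data_from_numpy(img)
    out = grpcclient.InferRequestedOutput(OUTPUT_NAME)

    result = client.infer(model_name=MODEL_NAME, inputs=[inp], outputs=[out])
    boxes = result.as_numpy(OUTPUT_NAME)

    # Only the (small) detection tensor comes back over the network; it can be
    # forwarded to the local server for display.
    print(boxes.shape)

cap.release()
```

Scaling to many streams is then mostly a matter of running one such client loop per stream and letting Triton's dynamic batching and instance groups handle concurrency on the server side.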

OctaM commented 2 years ago

@jbkyang-nvi all of those examples use image files or image folders. Are there any examples that work with video streams?

tanmayv25 commented 2 years ago

Similar question: https://github.com/triton-inference-server/server/issues/4487

jbkyang-nvi commented 2 years ago

See https://developer.nvidia.com/deepstream-sdk to learn more about DeepStream, NVIDIA's SDK for building video-stream analytics pipelines.

jbkyang-nvi commented 1 year ago

Closing this issue due to lack of activity. Please re-open it if you would like to follow up.