isarsoft / yolov4-triton-tensorrt

This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
http://www.isarsoft.com

Kubernetes #12

Closed ontheway16 closed 3 years ago

ontheway16 commented 3 years ago

Hello,

I am looking for the necessary steps to use this repo under Kubernetes. Do you think just replacing the repo name is enough, or what additional steps would be required? I'd appreciate any high-level overview.

philipp-schmidt commented 3 years ago

You need to build a Docker image on top of the available Python Docker images, clone this repo into it, and install the dependencies as described under clients/python. Then you can set the client script to run when the container starts (or write your own logic) and use the hostname of the inference server in your networking setup (making the actual network overlay work is up to you).
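
For illustration, here is a rough sketch of how such a client could be deployed in Kubernetes once that image is built. The image name, Deployment name and the Triton service hostname (`triton-inference-server`) are assumptions for this example, not something this repo ships; 8001 is Triton's default gRPC port.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yolov4-client           # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: yolov4-client
  template:
    metadata:
      labels:
        app: yolov4-client
    spec:
      containers:
        - name: client
          # Image built on top of a Python base image, with this repo cloned
          # and the dependencies from clients/python installed.
          image: registry.example.com/yolov4-triton-client:latest
          # How the Triton URL is passed depends on your client logic; inside
          # the cluster the server is reachable via its Service DNS name,
          # e.g. triton-inference-server:8001 for gRPC.
          command: ["python3", "client.py"]
```

The hostname just has to match whatever Service name you give the Triton server in your cluster.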

Regarding Kubernetes specifics I cannot help much, because this very much depends on your setup. We run this setup successfully in both Kubernetes and Docker Compose and have had no problems so far.

ontheway16 commented 3 years ago

Thanks for replying. I am looking at the following link to figure it out: https://github.com/triton-inference-server/server/tree/master/deploy/single_server, but I cannot find a way to store the models in a locally provisioned persistent volume instead of Google Cloud. Is that possible at all?

philipp-schmidt commented 3 years ago

There are a variety of ways to share a Docker volume between containers. The simplest one is mounting a single volume in both containers, which should solve your problem.
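
For the Kubernetes case (complementing the helm chart linked above), a rough sketch would be to replace the GCS model repository with a local PersistentVolume and mount it into the Triton container. All names, paths and sizes below are assumptions; a hostPath volume only works on a single node, so use NFS or another ReadWriteMany-capable backend for multi-node clusters.

```yaml
# Hypothetical names, paths and sizes; adapt to your cluster.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: triton-model-repo
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/triton-models    # node-local directory holding the model repository
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: triton-model-repo
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# In the Triton Deployment, mount the claim and point the server at it
# instead of a gs:// bucket (sketch, not the helm chart's exact values):
#
#   volumes:
#     - name: model-repo
#       persistentVolumeClaim:
#         claimName: triton-model-repo
#   containers:
#     - name: triton
#       args: ["tritonserver", "--model-repository=/models"]
#       volumeMounts:
#         - name: model-repo
#           mountPath: /models
```

The TensorRT engine generated by this repo then just has to be placed in that directory on the node, in the usual Triton model repository layout.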