Xilinx / inference-server

https://xilinx.github.io/inference-server/
Apache License 2.0

Python Backend and Ensemble Scheduling #153

Closed: varunsh-xilinx closed this issue 1 year ago

varunsh-xilinx commented 1 year ago
Originally posted by **[dbasbabasi](https://github.com/dbasbabasi)** February 20, 2023

I see that pre- and post-processing has to be done on the client side. I couldn't find any support for a custom Python backend or for ensemble scheduling to run pre- and post-processing on the server side, as Triton Inference Server does. Ensemble scheduling also makes it possible to build complex pipelines that run multiple models during inference, for example as primary, secondary, or parallel stages, which is important for complex models.

When I build a pipeline with GStreamer or another video decoding system, I have to add custom plugins for inference, so everything becomes more complicated on the client side and I can't create complex model pipelines dynamically. If you supported a Python backend and ensemble scheduling, we could define complex pipelines with a model configuration file.

https://github.com/triton-inference-server/server/blob/e9ef15b0fc06d45ceca28861c98b31d0e7f9ee79/docs/user_guide/architecture.md#ensemble-models
https://github.com/triton-inference-server/python_backend

The best way to support dynamically created ensemble models would be to offer these features the way Nvidia does.
Originally posted by **[varunsh-xilinx](https://github.com/varunsh-xilinx)** February 22, 2023

Hi, thanks for bringing this up. Running pre- and post-processing on the server side is indeed planned but isn't supported yet. Do you have a public application that you're trying to enable but currently can't? Having a concrete use case helps define the problem and what's needed to address it.