triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

How to deploy ensemble models of different versions more elegantly? #7761

Open lzcchl opened 1 week ago

lzcchl commented 1 week ago

I already know how to deploy multiple versions of an ordinary model. Suppose we have two models, resnet_v1.pth and resnet_v2.pth. In my model_repo/resnet_pytorch directory there are folders "1" and "2", each holding one of these model files. Setting "version_policy" to "{ all{} }" in config.pbtxt is then enough to deploy both versions simultaneously. The deployment log is as follows (at this point model_repo contains only one folder, "resnet_pytorch"):

+----------------+---------+--------+
| Model          | Version | Status |
+----------------+---------+--------+
| resnet_pytorch | 1       | READY  |
| resnet_pytorch | 2       | READY  |
+----------------+---------+--------+
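A minimal sketch of the single-model layout described above, written as a Python script that builds the directory tree. The file name `model.pt`, the `pytorch_libtorch` platform, and the helper name `build_repo` are assumptions for illustration (the post only mentions `.pth` files), and the config deliberately omits the input/output sections a real model would need.

```python
import tempfile
from pathlib import Path

# Assumed minimal config; a real one also needs input/output definitions.
CONFIG = '''name: "resnet_pytorch"
platform: "pytorch_libtorch"
version_policy: { all { } }
'''


def build_repo(root: Path, versions=(1, 2)) -> Path:
    """Create model_repo/resnet_pytorch/<version>/ plus a shared config.pbtxt."""
    model_dir = root / "resnet_pytorch"
    for version in versions:
        version_dir = model_dir / str(version)
        version_dir.mkdir(parents=True, exist_ok=True)
        # Placeholder file standing in for the exported model weights.
        (version_dir / "model.pt").touch()
    (model_dir / "config.pbtxt").write_text(CONFIG)
    return model_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = build_repo(Path(tmp) / "model_repo")
        # Print the resulting tree relative to model_repo.
        for path in sorted(model_dir.rglob("*")):
            print(path.relative_to(model_dir.parent))
```

The one config.pbtxt is shared by both version folders, which is why a single `version_policy` entry is enough to expose versions 1 and 2.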

However, in actual use we often deploy ensemble models, such as preprocess+resnet_v1.pth and preprocess+resnet_v2.pth. My current approach is to create two folders under model_repo, ensemble_resnet_pytorch_v1 and ensemble_resnet_pytorch_v2. In their two config.pbtxt files, I set the model_version under ensemble_scheduling to 1 and 2 respectively. The deployment log is as follows (model_repo now contains four folders: "resnet_pytorch", "resnet_preprocess", "ensemble_resnet_pytorch_v1", "ensemble_resnet_pytorch_v2"):

+----------------------------+---------+--------+
| Model                      | Version | Status |
+----------------------------+---------+--------+
| ensemble_resnet_pytorch_v1 | 1       | READY  |
| ensemble_resnet_pytorch_v2 | 1       | READY  |
| resnet_preprocess          | 1       | READY  |
| resnet_pytorch             | 1       | READY  |
| resnet_pytorch             | 2       | READY  |
+----------------------------+---------+--------+
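The two-folder approach above can be sketched as a small generator that renders both ensemble configs from one template, so only the ensemble name and the pinned `model_version` differ. All tensor names (`RAW_IMAGE`, `SCORES`, `INPUT_0`, `OUTPUT_0`, `INPUT__0`, `OUTPUT__0`, `preprocessed`), the data types, dims, `max_batch_size`, and the helper name `write_ensembles` are hypothetical placeholders, not taken from the post.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical ensemble config; tensor names, types, and dims are placeholders.
ENSEMBLE_TEMPLATE = Template('''name: "$name"
platform: "ensemble"
max_batch_size: 8
input [
  {
    name: "RAW_IMAGE"
    data_type: TYPE_UINT8
    dims: [ -1 ]
  }
]
output [
  {
    name: "SCORES"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "resnet_preprocess"
      model_version: -1
      input_map {
        key: "INPUT_0"
        value: "RAW_IMAGE"
      }
      output_map {
        key: "OUTPUT_0"
        value: "preprocessed"
      }
    },
    {
      model_name: "resnet_pytorch"
      model_version: $version
      input_map {
        key: "INPUT__0"
        value: "preprocessed"
      }
      output_map {
        key: "OUTPUT__0"
        value: "SCORES"
      }
    }
  ]
}
''')


def write_ensembles(repo: Path, versions=(1, 2)) -> list:
    """Write one ensemble folder per resnet_pytorch version to pin."""
    created = []
    for version in versions:
        name = f"ensemble_resnet_pytorch_v{version}"
        model_dir = repo / name
        # An ensemble has no model file, but still needs a version directory.
        (model_dir / "1").mkdir(parents=True, exist_ok=True)
        config = ENSEMBLE_TEMPLATE.substitute(name=name, version=version)
        (model_dir / "config.pbtxt").write_text(config)
        created.append(model_dir)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for model_dir in write_ensembles(Path(tmp) / "model_repo"):
            print(model_dir.name)
```

This does not reduce the number of folders; it only keeps the duplicated configs in sync by generating them from a single source.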

However, I think creating a separate folder for each ensemble ("ensemble_resnet_pytorch_v1", "ensemble_resnet_pytorch_v2") is not elegant enough. Is there a more appropriate way to deploy different versions of an ensemble model, ideally a single folder "ensemble_resnet_pytorch" containing two versions?

dong-toggle-ai commented 1 week ago

Same here. It would be good to know the best way to use multiple versions for ensemble models. Have you found a solution yet?