awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0

Overriding the model routing logic #1017

Open James-UnlikelyAI opened 1 year ago

James-UnlikelyAI commented 1 year ago

Is it possible to customise how the server decides which model to send a request to?

My use case is that we have many models, each with several versions, and I want requests sent to the mymodel handle to be routed automatically to the latest version. The models are named mymodel_v1, mymodel_v2, ..., mymodel_v9, and the version number is incremented fairly regularly as part of a continuous deployment process.

Is there somewhere in the code that I could override in order to do this routing? On a related note, if there is a better way to do this, I would love to know. Thanks!
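
To make the desired behaviour concrete, here is a minimal sketch of the resolution step, assuming model names follow the mymodel_vN pattern described above. The function name resolve_latest and the example list of names are hypothetical and are not part of Multi Model Server; this only illustrates the mapping from a base handle to its newest version.

```python
import re


def resolve_latest(base_name, registered_names):
    """Return the registered name with the highest _vN suffix, or None.

    Hypothetical helper: not part of Multi Model Server.
    """
    # Match names such as "mymodel_v12" and capture the numeric version.
    pattern = re.compile(rf"^{re.escape(base_name)}_v(\d+)$")
    best_name, best_version = None, -1
    for name in registered_names:
        match = pattern.match(name)
        if match is None:
            continue
        version = int(match.group(1))
        # Compare numerically so that v10 ranks above v9.
        if version > best_version:
            best_name, best_version = name, version
    return best_name


# Example list of registered model names (made up for illustration).
names = ["mymodel_v1", "mymodel_v2", "mymodel_v10", "other_v3"]
print(resolve_latest("mymodel", names))
```

Logic like this could live in a thin proxy or client wrapper in front of the server. Whether the server itself offers a hook for it is the open question here.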