🚀 Feature
Supporting model reloads (when a new version is available) and multiple models.
Motivation
Other serving solutions support this, so adding it would make this server more attractive.
Pitch
Right now it is straightforward to serve a single model, but it is unclear how to serve multiple models, where each request (via the binary payload or HTTP arguments) indicates which model should be used.
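To make the pitch concrete, here is a minimal sketch of what such a mechanism could look like: a registry that maps a model name (taken from the request) to a loaded model, and transparently reloads when a newer version is published. All names here (`ModelRegistry`, `loader`, `fake_loader`) are illustrative assumptions, not part of any existing API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class _Entry:
    version: int
    model: Any


class ModelRegistry:
    """Hypothetical multi-model registry with reload-on-new-version."""

    def __init__(self, loader: Callable[[str, int], Any]):
        # `loader(name, version)` returns a loaded model object.
        self._loader = loader
        self._entries: Dict[str, _Entry] = {}

    def get(self, name: str, latest_version: int) -> Any:
        """Return the model for `name`, reloading it if a newer version exists."""
        entry = self._entries.get(name)
        if entry is None or entry.version < latest_version:
            model = self._loader(name, latest_version)
            self._entries[name] = _Entry(latest_version, model)
        return self._entries[name].model


# Usage: the request (e.g. an HTTP query argument) supplies the model name.
loads = []

def fake_loader(name: str, version: int) -> str:
    loads.append((name, version))       # record each (re)load for illustration
    return f"{name}-v{version}"

registry = ModelRegistry(fake_loader)
print(registry.get("resnet", 1))        # first request: loads resnet v1
print(registry.get("resnet", 1))        # same version: served from cache
print(registry.get("resnet", 2))        # newer version available: reloads
```

In a real server the `latest_version` signal would come from a model store or file watcher rather than the request itself; the key point is that routing and reloading live in one place instead of requiring one server process per model.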
Alternatives
Run N instances for the N models present at a given time; however, that breaks as soon as a new model appears.
Additional context
We have an internal C++ server that supports this; torch.serve supports it too, via what I believe they call an orchestrator.