Lightning-AI / LitServe

Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
https://lightning.ai/docs/litserve
Apache License 2.0
2.46k stars, 155 forks

More complex model management (multiple models, model reloading etc...) #282

Closed (bsergean closed 1 month ago)

bsergean commented 1 month ago

🚀 Feature

Support model reloading (when a new version becomes available) and serving multiple models.

Motivation

Other servers support this, so adding it would make LitServe more attractive.

Pitch

Right now it's obvious how to serve one model, but what if there are multiple models, and the request (binary payload or HTTP arguments) tells the server which model should be used?
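To make the pitch concrete, here is a minimal, framework-independent sketch of per-request model selection. The `ModelRouter` class and the `"model"` and `"input"` request keys are hypothetical names chosen for illustration and are not part of LitServe. In a LitServe app, similar dispatch logic could presumably live inside a `LitAPI` subclass's `predict` method.

```python
from typing import Any, Callable, Dict

# Assumption: a "model" is any callable that maps an input to a prediction.
Model = Callable[[Any], Any]


class ModelRouter:
    """Hypothetical helper that picks a model based on a field in the request."""

    def __init__(self, models: Dict[str, Model]) -> None:
        self._models = dict(models)

    def predict(self, request: Dict[str, Any]) -> Any:
        # The request names the model it wants under the (assumed) "model" key.
        name = request["model"]
        if name not in self._models:
            raise KeyError(f"unknown model: {name!r}")
        return self._models[name](request["input"])


# Two toy "models" standing in for real ones.
router = ModelRouter({
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
})

print(router.predict({"model": "double", "input": 3}))  # 6
print(router.predict({"model": "square", "input": 3}))  # 9
```

A request for a model name that was never registered raises `KeyError`, which a server would typically translate into an HTTP 4xx response.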

Alternatives

Run N instances for the N models present at a given time, but that won't work if a new model appears later.
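By contrast, a single process could accept new models and new versions at runtime. The following is a minimal sketch under stated assumptions: `ReloadableRegistry`, its `load` and `get` methods, and the integer version scheme are all hypothetical and do not describe LitServe or any existing server. It only illustrates swapping a model in place behind a lock.

```python
import threading
from typing import Any, Callable, Dict, Tuple

# Assumption: a "model" is any callable that maps an input to a prediction.
Model = Callable[[Any], Any]


class ReloadableRegistry:
    """Hypothetical registry that adds or replaces models while serving."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._models: Dict[str, Tuple[int, Model]] = {}

    def load(self, name: str, version: int, model: Model) -> bool:
        """Register or replace a model; skip versions that are not newer."""
        with self._lock:
            current = self._models.get(name)
            if current is not None and current[0] >= version:
                return False
            self._models[name] = (version, model)
            return True

    def get(self, name: str) -> Tuple[int, Model]:
        """Return the (version, model) pair currently registered under name."""
        with self._lock:
            return self._models[name]


registry = ReloadableRegistry()
registry.load("sentiment", 1, lambda text: "v1:" + text)
registry.load("sentiment", 2, lambda text: "v2:" + text)  # reload with a newer version
registry.load("topic", 1, lambda text: "topic:" + text)   # a model that appears later

version, model = registry.get("sentiment")
print(version, model("hello"))  # 2 v2:hello
```

Requests already holding a reference to the old model finish with it, while later calls to `get` see the new version.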

Additional context

We have an internal C++ server that supports this. TorchServe supports it too, with what I believe they call an orchestrator.

bsergean commented 1 month ago

Looks like a duplicate of https://github.com/Lightning-AI/LitServe/issues/271

aniketmaurya commented 1 month ago

closing since duplicate of #271