Open cduk opened 3 months ago
Interesting idea:
/models -> List all current models {"BAAI/bge":""}
/embedding -> Check if "BAAI/bge" is in the list of models. Do not deploy dynamically.
/rerank
/state/load -> "jinaai/embed-v2" -> add to models, add max dynamic ones to
/state/unload -> Chan
Idea: Do not add inside /embedding -> That would be a huge mess. Perhaps Drawbacks:
Summary: If this comment gets 10 upvotes and no further concerns, I'll build it. It's a heavyweight feature that I would prefer to move into a separate service.
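To make the endpoint idea above a bit more concrete, here is a minimal sketch of the state it implies. Everything in it is an assumption for illustration: `ModelRegistry`, `ModelNotLoadedError`, and the `loader` callback are hypothetical names, not part of the existing code, and `unload` is assumed to simply drop the model from the list.

```python
from typing import Callable, Dict, List


class ModelNotLoadedError(KeyError):
    """Raised when a request names a model that is not in the registry."""


class ModelRegistry:
    """Hypothetical in-memory registry behind the sketched endpoints."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader  # stand-in for the real model loading code
        self._models: Dict[str, object] = {}

    def list_models(self) -> List[str]:
        # /models: names of everything currently loaded
        return sorted(self._models)

    def get(self, name: str) -> object:
        # /embedding and /rerank: look up only, never deploy on demand
        if name not in self._models:
            raise ModelNotLoadedError(name)
        return self._models[name]

    def load(self, name: str) -> None:
        # /state/load: explicit; loading an already loaded model is a no-op
        if name not in self._models:
            self._models[name] = self._loader(name)

    def unload(self, name: str) -> None:
        # /state/unload: assumed here to remove the model from the list
        self._models.pop(name, None)


# Dummy loader so the sketch runs without any real model weights.
registry = ModelRegistry(loader=lambda name: f"<model {name}>")
registry.load("BAAI/bge")
registry.load("jinaai/embed-v2")
print(registry.list_models())
```

The point of keeping `get` separate from `load` is that /embedding can only fail fast on an unknown model and never triggers a deployment by itself.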
The simpler way would be not to deal with loading and unloading at all: require that all models fit in VRAM, and then select which one to use in the API call.
So basically add multiple models in the cli at startup?
Exactly!
Instead of running an instance per model in the Dockerfile, can a list of models be provided at instantiation, with the model then chosen via the API request? The current API already has model as a parameter.
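A rough sketch of what that simpler variant could look like, assuming a repeatable `--model` flag (hypothetical, not an existing CLI option) and a dummy embedder in place of real model loading. `build_models` and `handle_embedding` are made-up names for illustration only.

```python
import argparse
from typing import Callable, Dict, List

Embedder = Callable[[List[str]], List[List[float]]]


def make_dummy_embedder(name: str) -> Embedder:
    # Placeholder for real model loading: one 1-d "embedding" per text.
    def embed(texts: List[str]) -> List[List[float]]:
        return [[float(len(t))] for t in texts]

    return embed


def build_models(argv: List[str]) -> Dict[str, Embedder]:
    parser = argparse.ArgumentParser()
    # Hypothetical repeatable flag: --model A --model B
    parser.add_argument("--model", action="append", required=True)
    args = parser.parse_args(argv)
    # All models are loaded once at startup and stay resident.
    return {name: make_dummy_embedder(name) for name in args.model}


def handle_embedding(
    models: Dict[str, Embedder], model: str, texts: List[str]
) -> List[List[float]]:
    # The request's model parameter only selects among preloaded models.
    if model not in models:
        raise ValueError(f"unknown model {model!r}; available: {sorted(models)}")
    return models[model](texts)


models = build_models(["--model", "BAAI/bge", "--model", "jinaai/embed-v2"])
result = handle_embedding(models, "BAAI/bge", ["hello"])
print(result)
```

With this shape there is no load or unload state to manage at runtime; an unknown model name is just a request error.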