Open ai-jz opened 7 months ago
What's your use case for these models? Their throughput is so low and the costs so prohibitive that I don't see any.
Setting aside that the quality delta between the top Mistral models and the top BERTs may be insignificant in many cases, I see a lot of value in a text embedding inference server outside of the rerank/search hot path. It can simplify pipelines that rely on embeddings for clustering.
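A minimal sketch of the kind of clustering pipeline meant here: fetch embeddings for a batch of documents, then cluster them. The `embed` helper below is a hypothetical stand-in for a call to an embedding inference server (in practice it would be an HTTP request to an endpoint such as TEI's `/embed`); it is stubbed with deterministic pseudo-embeddings so the sketch runs offline.

```python
import hashlib
import numpy as np

def embed(texts, dim=64):
    """Stub embedder: deterministic pseudo-embeddings, one unit vector
    per text. A real pipeline would POST the texts to an embedding
    inference server instead."""
    vecs = []
    for t in texts:
        seed = int.from_bytes(hashlib.sha256(t.encode()).digest()[:8], "big")
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(dim)
        vecs.append(v / np.linalg.norm(v))
    return np.stack(vecs)

def kmeans(x, k, iters=20, seed=0):
    """Tiny k-means over the embedding matrix x (n_docs, dim)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Assign each embedding to its nearest center (squared L2).
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centers from the current assignment.
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(axis=0)
    return labels

docs = ["refund policy", "shipping delay", "gpu out of memory", "cuda error"]
labels = kmeans(embed(docs), k=2)
print(labels)
```

The embedding server only needs to be fast enough for batch jobs here, not for a latency-sensitive search path, which is why throughput and cost matter less for this use case.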
Feature request
Support the recent, larger embedding models with 7B or more parameters (roughly 20x larger than BERT-large).
Motivation
Embedding models have grown much larger over the past few months. For example, Mistral-7B- and Mixtral-8x7B-based embedding models now rank at the top of the leaderboard:
https://huggingface.co/spaces/mteb/leaderboard
Do you plan to support such large embedding models (20x larger than BERT-large) via this repo or the TGI repo?
Your contribution
N/A