Closed: prasannakrish97 closed this issue 4 months ago
What's your use case for these models? Their throughput is so low and their costs so prohibitive that I don't see any.
The use cases are:
Given the variety of these use cases, SFR-Embedding-Mistral looks like it could be a single solution, thanks to its versatility and its no. 1 ranking on the Hugging Face MTEB leaderboard.
Assuming TEI supported this model, the aim was to benchmark it on CPU and GPU to get an idea of its inference time and compare it with other models.
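For anyone wanting to run that kind of comparison, here is a minimal sketch of a timing harness in Python. It is not tied to TEI or any particular server: `embed_fn` is a hypothetical callable that takes a list of strings and returns one vector per string, and `dummy_embed` is a placeholder that only exists to make the snippet runnable. The function name `benchmark`, the batch size, and the warmup and repeat counts are all illustrative assumptions.

```python
import statistics
import time


def benchmark(embed_fn, texts, batch_size=8, warmup=1, repeats=5):
    """Time full passes over `texts` and report the median pass duration.

    `embed_fn` is an assumed interface: list[str] -> list of vectors.
    """
    # Split the inputs into fixed-size batches once, outside the timed region.
    batches = [list(texts[i:i + batch_size]) for i in range(0, len(texts), batch_size)]

    # Untimed warmup passes, so one-off costs (model load, caches) are excluded.
    for _ in range(warmup):
        for batch in batches:
            embed_fn(batch)

    # Timed passes: each measurement covers every batch exactly once.
    pass_times = []
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            embed_fn(batch)
        pass_times.append(time.perf_counter() - start)

    median_s = statistics.median(pass_times)
    return {
        "texts": len(texts),
        "batches": len(batches),
        "median_seconds_per_pass": median_s,
        "texts_per_second": len(texts) / median_s if median_s > 0 else float("inf"),
    }


def dummy_embed(batch):
    # Hypothetical stand-in for a real model or HTTP call; returns 2-d vectors.
    return [[float(len(text)), 0.0] for text in batch]


if __name__ == "__main__":
    sample = ["first document", "second document", "third document"]
    print(benchmark(dummy_embed, sample, batch_size=2))
```

Swapping `dummy_embed` for a function that calls the model under test (once on CPU, once on GPU) would give numbers that are comparable across models, as long as the same texts and batch size are used for each run.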
I can already tell you that inference times will be multiple orders of magnitude worse.
@OlivierDehaene The performance of a large model like Mistral is prohibitive for most embedding purposes (especially at scale). However, the quality of the embeddings makes it appealing for niche situations where performance isn't as crucial (for example, in a pipeline doing clustering rather than fast RAG search).
Having an inference server that supports it makes things easier. When I've used these Mistral embedding models, I've relied on a quick-and-dirty fix in candle (https://github.com/huggingface/candle/pull/1636) to get things going.
Besides not seeing value in this use case, is there anything else you view as problematic about supporting this model architecture?
+1 for this feature request. My use case requires high-quality embeddings for agent memories that are periodically generated. Very similar to section 4.2 from this paper.
TEI is a good choice for my research because it is well supported from a client standpoint. Moreover, if I understand the performance tradeoff correctly, it would be similar to normal inference and, as was mentioned, is not prohibitive for some use cases.
I completely agree with you, @theobjectivedad & @functorism: TEI is a really good choice for evaluating a model and getting an idea of its quality and performance for your specific use case. It would be really awesome if TEI supported this model, or even Mistral-family embedding models more widely.
Alternatively, I'm using infinity, which supports the SFR-Embedding-Mistral model very well :-) (cf. https://github.com/michaelfeil/infinity)
OK, I'm coming back to this issue and I will add it soon. @prasannakrish97 please don't mention other OSS projects here; that's bad etiquette.
@OlivierDehaene Hello, does TEI support SFR-Embedding-Mistral (https://huggingface.co/Salesforce/SFR-Embedding-Mistral) now? I have this need as well.
Model description
It would be awesome if TEI supported SFR-Embedding-Mistral, which sits at the top of the MTEB leaderboard: https://huggingface.co/Salesforce/SFR-Embedding-Mistral
Open source status
Provide useful links for the implementation
https://huggingface.co/Salesforce/SFR-Embedding-Mistral