Closed Steinkreis closed 2 months ago
🤖
Hello @Steinkreis,
I'm Dosu, a bot designed to assist with the LangChain repository. I'm here to help you navigate through bugs, answer your questions, and guide you as a contributor. While we wait for a human maintainer, feel free to ask me anything you need. Let's make your experience with LangChain as smooth as possible!
It seems like you've got this under control; if you want help or have specific questions, let me know what I can do for you!
I made a minimal working example, in case anyone wants to use it before LangChain supports TEI:
```python
import requests
from langchain_core.embeddings import Embeddings
from pydantic import BaseModel


class HuggingfaceTEIEmbeddings(BaseModel, Embeddings):
    """See <https://huggingface.github.io/text-embeddings-inference/>"""

    base_url: str
    normalize: bool = True
    truncate: bool = False
    query_instruction: str
    """Instruction to use for embedding query."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        response = requests.post(
            self.base_url + "/embed",
            json={
                "inputs": texts,
                "normalize": self.normalize,
                "truncate": self.truncate,
            },
        )
        return response.json()

    def embed_query(self, text: str) -> list[float]:
        instructed_query = self.query_instruction + text
        return self.embed_documents([instructed_query])[0]
```
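For anyone reading along, here is an offline sketch of the payload flow in the snippet above: `embed_query` prepends the instruction and delegates to `embed_documents`, which posts the inputs plus the `normalize`/`truncate` flags to `<base_url>/embed`. The HTTP call is replaced by plain dict construction so the sketch runs without a TEI server; the instruction string is just an example value:

```python
def embed_documents_payload(texts: list[str], normalize: bool = True,
                            truncate: bool = False) -> dict:
    # The JSON body that embed_documents above posts to TEI's /embed route.
    return {"inputs": texts, "normalize": normalize, "truncate": truncate}


def embed_query_payload(query_instruction: str, text: str) -> dict:
    # embed_query prepends the instruction, then embeds a one-item batch.
    return embed_documents_payload([query_instruction + text])


payload = embed_query_payload("Represent this question for retrieval: ", "what is TEI?")
print(payload["inputs"][0])
# → Represent this question for retrieval: what is TEI?
```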
I have a problem when using `HuggingFaceHubEmbeddings` to call a local TEI endpoint. The input schema of TEI's `embed` API is

```json
{ "inputs": "string", "normalize": true, "truncate": false }
```

but the payload actually sent is

```json
{ "inputs": "string", "parameters": _model_kwargs }
```

This causes the `embed` API to receive only the `inputs` parameter and ignore the other two. I modified the `embed_documents` function as follows, which passes the parameters to the `embed` API correctly:
```python
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    """Call out to HuggingFaceHub's embedding endpoint for embedding search docs.

    Args:
        texts: The list of texts to embed.

    Returns:
        List of embeddings, one for each text.
    """
    # Replace newlines, which can negatively affect performance.
    texts = [text.replace("\n", " ") for text in texts]
    _model_kwargs = self.model_kwargs or {}
    responses = self.client.post(
        # Merge model kwargs into the top level of the body so TEI
        # sees them, instead of nesting them under "parameters".
        json={"inputs": texts} | _model_kwargs,
        task=self.task,
    )
    return json.loads(responses.decode())
```
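The key change is merging `_model_kwargs` into the top level of the JSON body rather than nesting it under `"parameters"`, since TEI reads `normalize` and `truncate` at the top level. A minimal sketch of the difference, with hypothetical values:

```python
texts = ["hello", "world"]
model_kwargs = {"normalize": True, "truncate": False}

# What HuggingFaceHubEmbeddings sends today: the options are nested
# under "parameters", where TEI's /embed route never looks for them.
old_payload = {"inputs": texts, "parameters": model_kwargs}

# What the patched method sends: a dict-union merge (Python 3.9+)
# that puts the options at the top level, where TEI expects them.
new_payload = {"inputs": texts} | model_kwargs

print(sorted(new_payload))
# → ['inputs', 'normalize', 'truncate']
```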
Feature request
Similar to Text Generation Inference (TGI) for LLMs, Hugging Face created an inference server for text embedding models called Text Embeddings Inference (TEI). See: https://github.com/huggingface/text-embeddings-inference. Could you integrate TEI into the supported LangChain text embedding models, or is this already planned?
Motivation
We are currently developing a RAG-based chat app and plan to deploy the components as microservices (LLM, DB, embedding model). At the moment, the only other suitable solution for us would be SagemakerEndpointEmbeddings, so being able to use TEI would be a great benefit.
Your contribution
I work as an ML Engineer and could probably assist in some way if necessary.