Closed Steinkreis closed 2 months ago
🤖
Hello @Steinkreis,
I'm Dosu, a bot designed to assist with the LangChain repository. I'm here to help you navigate through bugs, answer your questions, and guide you as a contributor. While we wait for a human maintainer, feel free to ask me anything you need. Let's make your experience with LangChain as smooth as possible!
It seems like you've got this under control; if you want help or have specific questions, let me know what I can do for you!
I made a minimal working example, in case anyone wants to use it before LangChain supports TEI:
```python
import requests
from langchain_core.embeddings import Embeddings
from pydantic import BaseModel


class HuggingfaceTEIEmbeddings(BaseModel, Embeddings):
    """See <https://huggingface.github.io/text-embeddings-inference/>"""

    base_url: str
    normalize: bool = True
    truncate: bool = False
    query_instruction: str
    """Instruction to use for embedding query."""

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        response = requests.post(
            self.base_url + "/embed",
            json={
                "inputs": texts,
                "normalize": self.normalize,
                "truncate": self.truncate,
            },
        )
        return response.json()

    def embed_query(self, text: str) -> list[float]:
        instructed_query = self.query_instruction + text
        return self.embed_documents([instructed_query])[0]
```
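For anyone reading along, here is an offline sketch of the payload flow in the snippet above: `embed_query` prepends the instruction and delegates to `embed_documents`, which posts the inputs plus the `normalize`/`truncate` flags to `<base_url>/embed`. The HTTP call is replaced by plain dict construction so the sketch runs without a TEI server; the instruction string is just an example value:

```python
def embed_documents_payload(texts: list[str], normalize: bool = True,
                            truncate: bool = False) -> dict:
    # The JSON body that embed_documents above posts to TEI's /embed route.
    return {"inputs": texts, "normalize": normalize, "truncate": truncate}


def embed_query_payload(query_instruction: str, text: str) -> dict:
    # embed_query prepends the instruction, then embeds a one-item batch.
    return embed_documents_payload([query_instruction + text])


payload = embed_query_payload("Represent this question for retrieval: ", "what is TEI?")
print(payload["inputs"][0])
# → Represent this question for retrieval: what is TEI?
```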
I have a problem when using `HuggingFaceHubEmbeddings` to call a local TEI endpoint. The input schema of TEI's `embed` API is

```json
{ "inputs": "string", "normalize": true, "truncate": false }
```

but the payload actually sent is

```json
{ "inputs": "string", "parameters": _model_kwargs }
```

This causes the `embed` API to receive only the `inputs` parameter and ignore the other two. I modified the `embed_documents` function as follows, which passes the parameters to the `embed` API correctly:
```python
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    """Call out to HuggingFaceHub's embedding endpoint for embedding search docs.

    Args:
        texts: The list of texts to embed.

    Returns:
        List of embeddings, one for each text.
    """
    # Replace newlines, which can negatively affect performance.
    texts = [text.replace("\n", " ") for text in texts]
    _model_kwargs = self.model_kwargs or {}
    responses = self.client.post(
        # Merge model kwargs into the top level of the body so TEI
        # sees them, instead of nesting them under "parameters".
        json={"inputs": texts} | _model_kwargs,
        task=self.task,
    )
    return json.loads(responses.decode())
```
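The key change is merging `_model_kwargs` into the top level of the JSON body rather than nesting it under `"parameters"`, since TEI reads `normalize` and `truncate` at the top level. A minimal sketch of the difference, with hypothetical values:

```python
texts = ["hello", "world"]
model_kwargs = {"normalize": True, "truncate": False}

# What HuggingFaceHubEmbeddings sends today: the options are nested
# under "parameters", where TEI's /embed route never looks for them.
old_payload = {"inputs": texts, "parameters": model_kwargs}

# What the patched method sends: a dict-union merge (Python 3.9+)
# that puts the options at the top level, where TEI expects them.
new_payload = {"inputs": texts} | model_kwargs

print(sorted(new_payload))
# → ['inputs', 'normalize', 'truncate']
```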
Feature request
Similar to Text Generation Inference (TGI) for LLMs, Hugging Face created an inference server for text embedding models called Text Embeddings Inference (TEI). See: https://github.com/huggingface/text-embeddings-inference. Could you integrate TEI into the supported LangChain text embedding models, or is this already planned?
Motivation
We are currently developing a RAG-based chat app and plan to deploy the components as microservices (LLM, DB, embedding model). At the moment, the only other suitable solution for us would be SagemakerEndpointEmbeddings, so being able to use TEI would be a great benefit.
Your contribution
I work as an ML Engineer and could probably assist in some way if necessary.