chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Feature Request]: Llama_Cpp_Python Support for embedding function #2409

Open AveryUALibrary opened 3 months ago

AveryUALibrary commented 3 months ago

Describe the problem

I would like the collection class to have a built-in embedding function that uses llama-cpp-python and a local model to embed documents

Describe the proposed solution

Since I have no idea how to contribute to an open repo, I wish this were one of the built-in embedding functions available for collections:

from typing import Any, cast

import numpy as np
from chromadb import Documents, EmbeddingFunction, Embeddings
from llama_cpp import Llama
from torch import cuda

class LlamaCppEmbeddingFunction(EmbeddingFunction):
    def __init__(self, model_path: str, **kwargs: Any):
        """
        Initialize the LlamaCppEmbeddingFunction. This function will embed documents using the Llama-CPP-Python library.

        Args:
            model_path (str): Path to the model file.
            kwargs: Additional arguments to pass to the Llama constructor.
                * n_ctx (int): The context size.
                * n_threads (int): The number of cpu threads to use.
                * n_gpu_layers (int): The number of layers to run on the GPU.
        """
        self.model_path = model_path

        # Check if verbose is in kwargs, if not set to False
        if 'verbose' not in kwargs:
            kwargs['verbose'] = False
        # Force embedding to be True
        kwargs['embedding'] = True
        # Check if the computer has a GPU, if not set n_gpu_layers to 0
        if cuda.is_available():
            if 'n_gpu_layers' not in kwargs:
                kwargs['n_gpu_layers'] = 1
        else:
            kwargs['n_gpu_layers'] = 0

        try:
            self.llm_embedding = Llama(model_path, **kwargs)
        except Exception as e:
            raise Exception(f"Error initializing LlamaCppEmbeddingFunction: {e}")

    def __call__(self, input: Documents) -> Embeddings:
        # Embed the documents with llama.cpp and pull the vectors out of the response
        llama_embeddings = [
            item['embedding']
            for item in self.llm_embedding.create_embedding(list(input))['data']
        ]
        # Round-trip through numpy to return plain lists of floats
        return cast(Embeddings, np.array(llama_embeddings).tolist())
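
A quick sanity check (the model path below is just a placeholder) would be to call the function directly and confirm it returns one vector per document:

# Placeholder path; any local GGUF model that supports embeddings should work here.
ef = LlamaCppEmbeddingFunction(model_path="./models/embedding-model.gguf", n_ctx=512)
vectors = ef(["Hello from llama.cpp", "A second test document"])
print(len(vectors), len(vectors[0]))  # document count and embedding dimensionality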

Alternatives considered

No response

Importance

I cannot use Chroma without it

Additional Information

The hope is to use it for my vector database like so:

# create client and a new collection
db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection(
    "library_chat",
    embedding_function=LlamaCppEmbeddingFunction(
        model_path=temp_path, n_ctx=512, n_threads=n_cpu_cores, n_gpu_layers=32
    ),
)
documents = SimpleDirectoryReader("./text/").load_data()
vector_store = ChromaVectorStore(chroma_collection=chroma_collection) # Create a vector store that uses the ChromaDB collection
storage_context = StorageContext.from_defaults(vector_store=vector_store) # Generates a storage context with default settings
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context # Embeds the documents using the LLM model
)

But I don't think it is actually using the embedding function I specified for the collection
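
As a sanity check, I'd expect adding documents straight to the Chroma collection (bypassing LlamaIndex) to go through the custom embedding function, something like:

# Adding documents directly forces Chroma to call the collection's embedding function
chroma_collection.add(
    ids=["doc-1", "doc-2"],
    documents=["first test document", "second test document"],
)
results = chroma_collection.query(query_texts=["test"], n_results=1)
print(results["ids"])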

tazarov commented 3 months ago

@AveryUALibrary, this is nice. Feel free to contribute it as a PR. Technically, this is already supported via Ollama; however, your approach removes a few layers of complexity, so I think it would be awesome to support it.
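
For comparison, the Ollama-backed path looks roughly like this (assuming a local Ollama server is running and an embedding model such as nomic-embed-text has already been pulled):

import chromadb
from chromadb.utils.embedding_functions import OllamaEmbeddingFunction

# Rough sketch of the existing Ollama route; the URL and model name are
# assumptions and depend on your local Ollama setup.
ollama_ef = OllamaEmbeddingFunction(
    url="http://localhost:11434/api/embeddings",
    model_name="nomic-embed-text",
)
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("library_chat", embedding_function=ollama_ef)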