chroma-core / chroma

the AI-native open-source embedding database
https://www.trychroma.com/
Apache License 2.0

[Bug]: Chroma server does not free up the memory #2673

Open saqlainumer-181 opened 1 month ago

saqlainumer-181 commented 1 month ago

What happened?

I have been using Chroma's HttpClient. If I send 500 requests in parallel, the Chroma server's memory usage grows during the calls but is not released after the results are returned. What are the possible causes and solutions?

Here is the code:

```python
import chromadb
import gc
import os
import time
import psutil
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from chromadb.types import SegmentScope

model_name = "text-embedding-ada-002"
openai_api_key = my_api_key

EMBEDDINGS = OpenAIEmbeddings(
    model=model_name, openai_api_key=openai_api_key, request_timeout=30000
)

organization_id = my_organization_id
customer_id = customer_id

VECTOR_DB_HOST = "127.0.0.1"
VECTOR_DB_PORT = "5000"
collection_name = "657c87c28f04e8decca92969_66a8f9f5284fdd5763f671a2"

# Initialize the shared client outside of the parallel loop
client = chromadb.HttpClient(host=VECTOR_DB_HOST, port=VECTOR_DB_PORT)

# Initialize the shared Chroma vector database object
VECTORDB = Chroma(
    collection_name=collection_name,
    embedding_function=EMBEDDINGS,
    client=client,
    collection_metadata={"hnsw:space": "cosine"},
)

PROMPT = "write job description of a software engineer"

total_calls = 500
parallel_requests = 100

# Function to make the API request
def make_request(i):
    print(f"Starting call {i}...")
    try:
        # Use the shared VECTORDB instance for the request
        results = VECTORDB.similarity_search_with_score(PROMPT, k=10)
        if results:
            print(results)
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")
    print(f"Finished call {i}.")

# Using ThreadPoolExecutor to handle parallel requests
with ThreadPoolExecutor(max_workers=parallel_requests) as executor:
    futures = [executor.submit(make_request, i) for i in range(total_calls)]
    for future in as_completed(futures):
        pass

print("All results extracted")
```
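The script imports psutil but never samples memory, so "does not free up the memory" is hard to quantify. Below is a minimal, stdlib-only sketch (Linux-specific, since it reads `/proc`) for measuring a process's current resident set size before and after a workload; for the Chroma server you would pass the server's PID instead of the current process's:

```python
import os

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")

def rss_mb(pid: int) -> float:
    """Current resident set size of a process in MiB (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])  # 2nd field = resident pages
    return resident_pages * PAGE_SIZE / (1024 ** 2)

# For the Chroma server, substitute the server's PID here.
# Sampling the current process just demonstrates the helper.
before = rss_mb(os.getpid())
payload = [bytes(1024) for _ in range(10_000)]  # stand-in for a real workload
after = rss_mb(os.getpid())
print(f"RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

Sampling the server's RSS once before the 500 calls, once right after, and once a few minutes later would distinguish a genuine leak (RSS keeps growing across runs) from an allocator simply holding on to freed pages.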

Versions

Name: chromadb
Version: 0.5.5

Relevant log output

No response

tazarov commented 4 weeks ago

@saqlainumer-181, thanks for reporting this. Chroma had a SQLite connection pool leak, which caused open file handles to linger and accumulate over time, especially under heavy (i.e., concurrent) load. The issue was addressed in PR #2014, but the fix has not yet been released. You can pull the Chroma docker image chromadb/chroma:0.5.6.dev39 or later to get it.
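To try the pre-release image, something like the following should work (a sketch: Chroma's container listens on port 8000 by default, so map it to whichever host port your client targets; the reproduction script above uses 5000):

```shell
docker pull chromadb/chroma:0.5.6.dev39
# Map container port 8000 (Chroma's default) to host port 5000
docker run -d -p 5000:8000 chromadb/chroma:0.5.6.dev39
```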

saqlainumer-181 commented 4 weeks ago

I have used the chromadb/chroma:0.5.6.dev39 image. However, when I make concurrent calls to ChromaDB, it still does not free up the memory after all the processing (search, in this case) is completed.