Open saqlainumer-181 opened 1 month ago
@saqlainumer-181, thanks for reporting this. Chroma had a SQLite connection pool leak that caused open file handles to linger and accumulate over time, especially under heavy (i.e. concurrent) load. The issue was addressed in PR #2014, but the fix has not yet been released. You can pull the Chroma Docker image chromadb/chroma:0.5.6.dev39 or later to get it.
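Before re-testing, it can be worth confirming that the server you are talking to is actually running the patched build. A minimal sketch, assuming the chromadb HTTP client's `get_version()` call and using placeholder host/port values; the `parse_version` helper is hypothetical and only does a rough dotted-string split:

```python
def check_server_version(host: str = "127.0.0.1", port: int = 5000) -> str:
    """Ask a running Chroma server which version it reports."""
    import chromadb  # imported lazily so parse_version works without the package
    client = chromadb.HttpClient(host=host, port=port)
    return client.get_version()


def parse_version(v: str) -> list:
    """Split '0.5.6.dev39' into [0, 5, 6, 'dev39'] for a rough ordering check."""
    return [int(p) if p.isdigit() else p for p in v.split(".")]
```

For example, `parse_version(check_server_version())` starting with `[0, 5, 6]` (or higher) would indicate the server is at least on the dev build that contains the fix, whereas `[0, 5, 5]` would mean the old image is still being served.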
Contributor
I have used the chromadb/chroma:0.5.6.dev39 image. However, when I make concurrent calls to ChromaDB, the server still does not free up the memory after all the processing (search, in this case) is complete.
What happened?
I have been using Chroma's HttpClient. If I send 500 requests in parallel, the Chroma server uses memory during these calls but does not free it after the results are returned. What are the possible causes and solutions?
Here is the code:
```python
import gc
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import chromadb
import psutil
import requests
from chromadb.types import SegmentScope
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

model_name = "text-embedding-ada-002"
openai_api_key = my_api_key

EMBEDDINGS = OpenAIEmbeddings(
    model=model_name, openai_api_key=openai_api_key, request_timeout=30000
)

organization_id = my_organization_id
customer_id = customer_id

VECTOR_DB_HOST = "127.0.0.1"
VECTOR_DB_PORT = "5000"
collection_name = "657c87c28f04e8decca92969_66a8f9f5284fdd5763f671a2"

# Initialize the shared client outside of the parallel loop
client = chromadb.HttpClient(host=VECTOR_DB_HOST, port=VECTOR_DB_PORT)

# Initialize the shared Chroma vector database object
VECTORDB = Chroma(
    collection_name=collection_name,
    embedding_function=EMBEDDINGS,
    client=client,
    collection_metadata={"hnsw:space": "cosine"},
)

PROMPT = "write job description of a software engineer"

total_calls = 500
parallel_requests = 100

# Function to make the API request
def make_request(i):
    print(f"Starting call {i}...")
    # The search step described above (the body was truncated in the report)
    return VECTORDB.similarity_search(PROMPT)

# Using ThreadPoolExecutor to handle parallel requests
with ThreadPoolExecutor(max_workers=parallel_requests) as executor:
    futures = [executor.submit(make_request, i) for i in range(total_calls)]
    for future in as_completed(futures):
        pass

print("All results extracted")
```
Versions
Name: chromadb
Version: 0.5.5
Relevant log output
No response