Closed: onazaruk closed this issue 4 days ago
The issue you're encountering with the increasing number of `asyncio_*` threads is likely due to the way asynchronous tasks are being handled within your web service, especially when blocking calls like `asyncio.get_event_loop().run_until_complete()` are used in a synchronous environment. This can lead to resource exhaustion over time, as a new event loop and its threads may be created for each operation.
To address this, consider refactoring your web service to fully embrace asynchronous operations: declare your endpoint functions with `async def` and directly await the asynchronous operations (e.g., `await elasticsearch_store.async_add(documents)`) instead of resorting to blocking calls. This approach is supported by modern Python web frameworks such as FastAPI, Starlette, or Sanic, which allow asynchronous request handlers.
Here's a simplified example for a FastAPI application:
```python
from typing import List

from fastapi import FastAPI, HTTPException

from your_application import ElasticsearchStore  # Adjust the import path as necessary

app = FastAPI()
elasticsearch_store = ElasticsearchStore(...)  # Initialize your ElasticsearchStore

@app.post("/add_documents")
async def add_documents(documents: List[YourDocumentType]):  # Replace YourDocumentType with your actual document type
    try:
        ids = await elasticsearch_store.async_add(documents)
        return {"ids": ids}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
```
This change ensures that your web service properly utilizes asynchronous I/O, reducing the overhead associated with thread management and potentially resolving the issue of the increasing number of `asyncio_*` threads.
Regarding the `VectorStoreIndex.from_documents()` method in the LlamaIndex library, it's important to note that asynchronous operations, when handled correctly, do not inherently create new `asyncio_*` threads. Instead, they schedule tasks on the event loop, allowing for concurrent execution without the overhead of thread creation. Ensure that your usage of LlamaIndex, or any other library, follows this asynchronous pattern to avoid similar issues.
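That last point can be shown with a self-contained sketch (plain `asyncio`, no LlamaIndex involved): a hundred coroutines awaited concurrently all run on a single event loop, and the process thread count stays flat.

```python
import asyncio
import threading

async def fake_io(i):
    # Stand-in for an awaitable I/O operation (e.g. an async Elasticsearch call).
    await asyncio.sleep(0.01)
    return i

async def main():
    # Schedule 100 coroutines concurrently on the one running event loop.
    return await asyncio.gather(*(fake_io(i) for i in range(100)))

before = len(threading.enumerate())
results = asyncio.run(main())
after = len(threading.enumerate())
print(len(results), after - before)  # 100 results, 0 extra threads
```

`asyncio.run()` also closes the loop when it finishes, so nothing lingers between calls.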
Bug Description
My data ingestion tool runs as a web service, and data ingestion is triggered occasionally. I noticed that the number of `asyncio_*` threads increases with each `VectorStoreIndex.from_documents()` call. The storage_context is `ElasticsearchStore`. After a couple of index creation iterations, `threading.enumerate()` returns:

Is this a known issue? Can it be fixed?

Elasticsearch lib version is 8.12.1. LlamaIndex version is 0.10.20.
Version
0.10.20
Steps to Reproduce
On a webserver (e.g. Flask):
1) Create an endpoint that creates an index from an Elasticsearch storage context with `VectorStoreIndex.from_documents(docs, storage_context=elastic_store, use_async=True)`
2) Add `print(", ".join([t.name for t in threading.enumerate() if t.name.startswith("asyncio")]))` at the end of the endpoint execution
3) Trigger the webservice endpoint a couple of times
4) Check the console output to see the last `print` output with the list of active threads

Relevant Logs/Tracebacks
No response