run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
36.63k stars 5.24k forks

[Bug]: OpensearchVectorClient is not closing the opensearch client on destruction, nor does it expose close. #12700

Open Panoplos opened 7 months ago

Panoplos commented 7 months ago

Bug Description

Create an OpensearchVectorStore with an OpensearchVectorClient instance and use it with an index. Delete the instances and you get the following errors:

Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x17fd00370>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x17e1dfe20>, 375268.956332625)]']
connector: <aiohttp.connector.TCPConnector object at 0x17fd00310>

Version

llama-index @0.10.18 + llama-index-vector-stores-opensearch @0.1.7

Steps to Reproduce

See description.

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 7 months ago

Hey @Panoplos! 🚀 Good to see you back. Hope you've been keeping well amidst all the coding adventures! Let's dive into this and get that bug squashed.

To address the issue with OpensearchVectorClient not closing the OpenSearch client upon destruction, leading to unclosed client sessions and connectors, you can implement a cleanup process using the try...finally pattern. This ensures that the asynchronous cleanup code is properly executed after each operation, which is crucial for preventing resource leaks in an asynchronous environment.

Here's how you can modify your os_store fixture to include cleanup:

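import asyncio
from typing import Generator

import pytest
from llama_index.vector_stores.opensearch import OpensearchVectorClient, OpensearchVectorStore

# Assumption: a module-level event loop drives the async cleanup calls below.
evt_loop = asyncio.get_event_loop()


@pytest.fixture()
def os_store(index_name: str) -> Generator[OpensearchVectorStore, None, None]:
    client = OpensearchVectorClient(
        endpoint="localhost:9200",
        index=index_name,
        dim=3,
    )
    try:
        yield OpensearchVectorStore(client)
    finally:
        # Always delete the test index and close the underlying aiohttp session.
        evt_loop.run_until_complete(client._os_client.indices.delete(index=index_name))
        evt_loop.run_until_complete(client._os_client.close())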

This ensures that the client sessions are closed and the index is deleted as intended, even if an exception occurs or the test is interrupted.
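The try...finally idea can also be packaged as an async context manager, so the close call is awaited even when the body raises. The following is only a minimal sketch of that pattern: `FakeAsyncClient` and `managed` are hypothetical names used for illustration and are not part of llama_index or opensearch-py.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeAsyncClient:
    """Hypothetical stand-in for an async client that owns a network session."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


@asynccontextmanager
async def managed(client):
    """Yield the client and always await its close(), even if the body raises."""
    try:
        yield client
    finally:
        await client.close()


async def demo() -> bool:
    client = FakeAsyncClient()
    try:
        async with managed(client):
            raise RuntimeError("simulated failure mid-operation")
    except RuntimeError:
        pass
    # The finally clause in managed() has already run at this point.
    return client.closed


result = asyncio.run(demo())
```

In a real project the stand-in would be replaced by whatever object actually holds the aiohttp session; which attribute that is depends on the installed package version.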

Regarding the roles within the LlamaIndex framework, OpensearchVectorClient is responsible for direct interactions with an OpenSearch index, including initialization, indexing of vector data, document deletion, and querying. OpensearchVectorStore provides a higher-level abstraction, offering a simplified API for adding, deleting, and querying vector data, integrating OpensearchVectorClient into the LlamaIndex framework's vector store interface.

If you need further assistance or have more questions, feel free to ask.

About Dosu: This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.

lulmer commented 5 months ago

I have the same problem; it is annoying.

cl3t0 commented 2 months ago

Same bug here.

ulan-yisaev commented 1 month ago

Hi there,

I'm experiencing a similar issue in my project, where I encounter numerous "Unclosed client session" errors in my logs related to aiohttp client sessions:

2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd0555fdad0>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd055628150>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd05449f250>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd07c529b50>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd0566e3a10>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd056753350>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd054490d10>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd0566ffdd0>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd0540f98d0>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd047f059d0>
2024-09-25 09:02:20 [default_exception_handler] ERROR   Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7fd047f6b850>

This issue seems to be connected to the way the opensearch-py client handles async sessions. Switching to the synchronous methods removes the error, but it significantly reduces the backend's ability to handle concurrent requests efficiently.

I wanted to ask if there are any plans to fix this issue in the llama_index or OpensearchVectorClient implementation? It would be great to know if this is being actively worked on, as it's causing issues in production environments.

Thanks in advance for any updates or suggestions!

ulan-yisaev commented 1 month ago

I think a potential fix might look like this:

Initialize the OpensearchVectorClient like so:

self.opensearch_client = OpensearchVectorClient(endpoint, ...)

When ending the sessions, explicitly close the async client:

await chat_service.opensearch_client._os_async_client.close()
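Because the inner client is a private attribute, and this thread shows it under two names (`_os_client` and `_os_async_client`), a shutdown helper could look the attribute up defensively instead of hard-coding one name. This is a rough sketch under that assumption: `close_async_clients` and the `_Fake*` classes are hypothetical and exist only to make the example self-contained.

```python
import asyncio
import inspect


async def close_async_clients(vector_client, attr_names=("_os_async_client", "_os_client")):
    """Close any inner client found under the given private attribute names.

    Returns the list of attribute names whose close() was called.
    """
    closed = []
    for name in attr_names:
        inner = getattr(vector_client, name, None)
        close = getattr(inner, "close", None)
        if close is None:
            continue  # attribute missing in this version, or it has no close()
        result = close()
        if inspect.isawaitable(result):
            await result  # async clients return a coroutine from close()
        closed.append(name)
    return closed


class _FakeInner:
    """Hypothetical stand-in for the async OpenSearch client."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class _FakeVectorClient:
    """Hypothetical stand-in exposing only the async inner client."""

    def __init__(self) -> None:
        self._os_async_client = _FakeInner()


fake = _FakeVectorClient()
closed_names = asyncio.run(close_async_clients(fake))
```

With the real `OpensearchVectorClient`, the helper would be awaited once during application shutdown, for example `await close_async_clients(chat_service.opensearch_client)`. Relying on private attributes remains fragile until the library exposes a public close method.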

bjmvercelli commented 2 days ago

Any updates?