Open hamasurrehman opened 8 months ago
@hamasurrehman, thank you for the elaborate description of your problem and possible alternatives. We do have a PR on this #1792. I would appreciate your feedback on it.
I'm a little confused about the use-case here. I see two possible ways for this to be used:
1. Call close() on an HttpClient, keep a reference to the closed client somewhere, and later use it.
2. Call close() on an HttpClient, then let it go out of scope so it gets garbage collected.
I don't see much utility in the second case: when the object gets garbage collected it'll close all connections, and that happens relatively quickly.
The first case is interesting, though. Chroma server currently defaults to HTTP/1.1, which has a default keepalive of 30 seconds. This means that if you're creating a lot of HttpClients which only send a few requests each, you do end up with a lot of port usage hanging around. We could plumb in a close() method to close all connections in the client's threadpool. Is this what you meant, @hamasurrehman? Could you describe the resource limitations you're bumping up against?
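To make the idea concrete, here is a minimal sketch of a close() that releases every pooled connection. It assumes the client keeps its connections in a plain list; PooledClient and its attributes are hypothetical names for illustration, not Chroma's actual implementation, and only the standard library's http.client is used.

```python
import http.client


class PooledClient:
    """Hypothetical client that keeps a small pool of keep-alive connections."""

    def __init__(self, host: str, port: int, pool_size: int = 2) -> None:
        # HTTPConnection does not open a socket until the first request is sent.
        self._pool = [
            http.client.HTTPConnection(host, port, timeout=5)
            for _ in range(pool_size)
        ]
        self.closed = False

    def close(self) -> None:
        """Close every pooled connection; safe to call more than once."""
        for conn in self._pool:
            conn.close()  # releases the socket (and its local port) if one is open
        self._pool.clear()
        self.closed = True


client = PooledClient("localhost", 8000)
client.close()
client.close()  # second call finds an empty pool and does nothing harmful
```

Making close() idempotent keeps it safe to call from several cleanup paths without coordinating them.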
@beggers, how about proxies? Can they keep the connections open?
A user in Discord reported this:
tcp 1 0 testing00-alfredg:36892 staging-chromadb.i:8002 CLOSE_WAIT off (0.00/0/0)
tcp 1 0 testing00-alfredg:36412 staging-chromadb.i:8002 CLOSE_WAIT off (0.00/0/0)
I think calling session.close() could be an effective way for people to indicate that they want to release resources properly.
On top of that, we can provide a utility context manager.
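One way such a utility could look, as a sketch only: contextlib.closing from the standard library calls close() on exit for any object that has one. FakeClient and its ping() method are hypothetical stand-ins for a client that exposes close(); they are not the Chroma API.

```python
from contextlib import closing


class FakeClient:
    """Hypothetical stand-in for a client that exposes close()."""

    def __init__(self) -> None:
        self.closed = False

    def ping(self) -> str:
        if self.closed:
            raise RuntimeError("client is closed")
        return "ok"

    def close(self) -> None:
        self.closed = True


# closing() calls client.close() when the block exits, even if it raises.
with closing(FakeClient()) as client:
    status = client.ping()
```

The advantage of a wrapper like this is that the client class itself only needs a close() method; no changes to its interface are required to use it in a with statement.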
Describe the problem
Currently, the chromadb.HttpClient class lacks a method for explicitly closing the connection. This omission poses challenges for users in managing resources efficiently and ensuring proper cleanup after using the client. Without a designated way to close connections, users may encounter resource leaks or inefficient resource utilization, particularly in scenarios involving long-lived applications or numerous concurrent connections.
Describe the proposed solution
I propose augmenting the chromadb.HttpClient class with a close() function to facilitate explicit closure of connections. This method would empower users to responsibly manage resources by enabling them to explicitly release connections when they are no longer needed. By integrating a close() function, users can uphold best practices for resource management in Python and mitigate potential issues associated with lingering connections.
Alternatives considered
Automatic Connection Management: Rather than introducing a close() function, an alternative approach could involve implementing automatic connection management within the chromadb.HttpClient. This would entail the client automatically closing connections after a certain period of inactivity or when they reach a predefined threshold. However, this approach might lack flexibility and could potentially lead to unexpected connection closures, especially in scenarios where users require precise control over connection lifetimes.
Context Manager Support: Another option could be to add support for the context manager protocol (__enter__ and __exit__ methods) to the chromadb.HttpClient. By implementing this protocol, users could utilize the with statement to ensure proper resource cleanup, as connections would be automatically closed upon exiting the with block. While this approach offers convenience, it may not fully address cases where users need to explicitly manage connection lifetimes outside of a with context.
Manual Connection Management: Alternatively, users could manually manage connection lifetimes by explicitly calling a disconnect() or release() method on the chromadb.HttpClient to close connections. While this approach provides control over resource cleanup, it relies heavily on user diligence and may increase the risk of overlooking necessary cleanup steps, leading to potential resource leaks or inefficiencies.
Importance
I cannot use Chroma without it
Additional Information
This feature would enhance the usability and robustness of the ChromaDB client library, providing users with more control over resource management. It would also contribute to better adherence to Pythonic conventions and standards for database client libraries.
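For illustration, the explicit close() and the context manager alternative do not have to be mutually exclusive. The sketch below shows one class offering both; ClosableClient and query() are hypothetical names, not the chromadb.HttpClient API.

```python
class ClosableClient:
    """Hypothetical client offering both close() and the with-statement protocol."""

    def __init__(self) -> None:
        self._closed = False

    def query(self) -> str:
        # Reject use after close instead of silently reopening connections.
        if self._closed:
            raise RuntimeError("client is closed")
        return "result"

    def close(self) -> None:
        self._closed = True

    def __enter__(self) -> "ClosableClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Returning None lets any exception from the with block propagate.
        self.close()


# Scoped use: close() runs automatically at the end of the block.
with ClosableClient() as scoped:
    inside = scoped.query()

# Manual use: the caller decides when the connection lifetime ends.
manual = ClosableClient()
manual.close()
```

Raising on use after close is one possible design choice here; it makes lifetime mistakes visible early rather than leaving connections open unintentionally.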