vinvcn opened this issue 1 year ago
Across the entire cache operation, it is not the database access that takes the most time, but the embedding and other model calls. That said, asynchrony is indeed an optimization point, and we are currently planning it.
I would also be very keen to use an async version. My chatbot currently uses chain.arun() calls, so it would be great if LangChainLLMs from gptcache.adapter.langchain_models supported acall(). Is there an ETA for this feature?
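For context, this is the call pattern I am hoping for. A hypothetical sketch only: acall() does not exist yet, and I am assuming the wrapper keeps its current constructor:

```python
# Hypothetical sketch of the requested API -- acall() is not implemented.
from langchain.llms import OpenAI
from gptcache.adapter.langchain_models import LangChainLLMs

llm = LangChainLLMs(llm=OpenAI(temperature=0))

async def answer(question: str) -> str:
    # Desired: an awaitable counterpart to llm(question), so that
    # chain.arun() is not stalled by a synchronous cache lookup.
    return await llm.acall(question)
```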
What would you like to be added?
Hi, I did a brief study of the code base. It seems that most of the parts involving external I/O are not async-capable.
Given that the CPython implementation enforces the GIL, the main thread running the script will block on this I/O, drastically reducing concurrency. Worse, any call to retrieve cached results blocks here, no matter how well the user of this library has optimized their own code with async calls.
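Until native async support lands, one stopgap is to push the blocking call onto a worker thread so the event loop stays responsive. A minimal sketch, with a stand-in function in place of the real GPTCache lookup:

```python
import asyncio
import time

def blocking_cache_lookup(query: str) -> str:
    # Stand-in for a synchronous cache lookup (embedding + vector search);
    # the real call would block the event loop for this long.
    time.sleep(0.5)
    return f"cached answer for {query!r}"

async def handle_request(query: str) -> str:
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so other coroutines keep making progress in the meantime.
    return await asyncio.to_thread(blocking_cache_lookup, query)

async def main():
    # Ten lookups overlap instead of serializing to ~5 seconds total.
    answers = await asyncio.gather(*(handle_request(f"q{i}") for i in range(10)))
    print(len(answers), "answers")

asyncio.run(main())
```

This only hides the blocking in a thread pool, though; true async clients for the underlying stores would avoid the threads entirely.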
Why is this needed?
I/O constitutes the major part of this library's work, and Python effectively runs only one thread at a time, so exposing async APIs is important IMHO.
Anything else?
Refer to the async Redis client as an example:
https://redis.com/blog/async-await-programming-basics-python-examples/
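For reference, a minimal sketch using redis-py's built-in asyncio support (redis.asyncio, available since redis-py 4.2), assuming a Redis server on localhost:

```python
import asyncio
import redis.asyncio as redis

async def main():
    # decode_responses=True returns str instead of bytes.
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    await r.set("gptcache:example", "cached value")
    print(await r.get("gptcache:example"))
    await r.aclose()  # close() on redis-py < 5

asyncio.run(main())
```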