Open · nasirus opened 1 year ago
In response to your issue, I have looked into the code and found a possible way to share the same client between the LlamaCpp LLM and LlamaCppEmbeddings. The sketch below shows the idea; the model path is a placeholder, and it assumes both wrappers expose their underlying llama_cpp.Llama instance through their client attribute.
```python
# Import the LlamaCpp LLM and embeddings wrappers
from langchain.embeddings import LlamaCppEmbeddings
from langchain.llms import LlamaCpp

# Build both wrappers as usual; each one creates its own llama_cpp.Llama client
llm = LlamaCpp(model_path="./models/ggml-model.bin")
embeddings = LlamaCppEmbeddings(model_path="./models/ggml-model.bin")

# Point the embeddings wrapper at the LLM's client so both use one model.
# Caveat: both clients are still created at construction time, and a client
# built for text generation may not have embedding support enabled, so this
# is a sketch of the idea rather than a tested recipe.
embeddings.client = llm.client
```
Once both wrappers point at the same client, the LLM and the embeddings run against a single loaded model, which should reduce steady-state memory usage. Both models are still loaded briefly during construction, though, so the cleaner fix is the change you describe: have the root_validator create a client only when one has not already been supplied.
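For completeness, here is a quick check (assuming the assignment above) that the two wrappers really do share one client; whether the embedding call succeeds depends on how the shared client was initialised:

```python
# Both attributes should now reference the same llama_cpp.Llama instance
assert embeddings.client is llm.client

# Calls on either wrapper now go through that single shared model
vector = embeddings.embed_query("hello world")
text = llm("Say hello")
```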
I hope this helps. Please let me know if you have any further questions.
Currently, when using any chain that has LlamaCpp as its LLM together with a vector store built with LlamaCppEmbeddings, two models have to be kept in memory, because each object creates its own client when it is constructed. I was wondering whether anything is in progress to change this and reuse the same client for both objects, since it seems to be just a matter of changing how the client is set up. For example, the root_validator could initialise the client only when it has not already been set, and the client could be accepted as a parameter when constructing the object; a rough sketch of what I mean is below.
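To illustrate the idea, here is a simplified sketch using a toy pydantic (v1) model rather than the real LlamaCpp class; the class name, fields, and Llama constructor arguments are illustrative only:

```python
from typing import Any, Dict

from pydantic import BaseModel, root_validator


class SharedClientLlamaWrapper(BaseModel):
    """Toy stand-in for LlamaCpp / LlamaCppEmbeddings showing the proposed pattern."""

    model_path: str
    client: Any = None  # underlying llama_cpp.Llama instance, optionally supplied

    @root_validator()
    def validate_environment(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        # Only build a new client when the caller has not supplied one,
        # so a single llama_cpp.Llama instance can be shared between objects.
        if values.get("client") is None:
            from llama_cpp import Llama  # imported lazily, as langchain does

            values["client"] = Llama(model_path=values["model_path"])
        return values
```

With a change along these lines, the embeddings object could simply be constructed with the LLM's existing client, for example:

```python
llm = SharedClientLlamaWrapper(model_path="./models/ggml-model.bin")
embeddings = SharedClientLlamaWrapper(model_path="./models/ggml-model.bin", client=llm.client)
assert embeddings.client is llm.client  # only one model is loaded
```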