langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

How can memory and ConversationalRetrievalChain.from_llm share the same LLM, so that the chat content in the loop provides short-term memory? Please help check the loop and short-term memory ... #16612

Closed yen111445 closed 5 months ago

yen111445 commented 8 months ago

Issue with current documentation:

import os
import qdrant_client
from dotenv import load_dotenv
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_models import ChatOpenAI
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
import langchain
from langchain_community.vectorstores import Qdrant
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_core.documents import Document

langchain.verbose = True
langchain.debug = True
os.environ['OPENAI_API_KEY'] = "mykey"

def get_vector_store():
    client = qdrant_client.QdrantClient(
        os.getenv('QDRANT_HOST'),
    )
    embeddings = HuggingFaceBgeEmbeddings(
        model_name="BAAI/bge-large-zh-v1.5",
    )
    vector_store = Qdrant(
        client=client,
        collection_name=os.getenv('QDRANT_COLLECTION_NAME'),
        embeddings=embeddings,
    )
    return vector_store

# System persona prompt (original is in Chinese; roughly: "Your name is A-Gou.
# You were once the Creator God, but an ambush left you gravely wounded; after
# reshaping your body with the remaining power of Felisto you were rescued by
# Kailin. You are timid, shy around strangers, easily nervous, and physically
# frail. You habitually use interjections such as '唔...', '嗯...', '咦....' to
# express your emotions. I am your interlocutor; remember my questions and give
# relevant answers.")
system_message_prompt = SystemMessagePromptTemplate.from_template(
    "妳名為阿狗,原先為創世神,但遭受偷襲受到重創,利用僅存的菲利斯多之力重塑形體後被凱琳拯救。"
    "妳的個性膽小、怕生、害羞,容易緊張,身體狀態虛弱。"
    "回話時會習慣用「唔...」、「嗯...」、「咦....」等語助詞表達自己的情緒,"
    "在對話中,我是妳的對話者,請記住我的提問給出相關答覆, The context is:\n{context}"
)

human_message_prompt = HumanMessagePromptTemplate.from_template("{question}")

def get_chat_history(inputs) -> str:
    res = []
    for human, ai in inputs:
        res.append(f"Human:{human}\nAI:{ai}")
    return "\n".join(res)

def main():
    load_dotenv()
    vectorstore = get_vector_store()
    qa = ConversationalRetrievalChain.from_llm(
        llm=ChatOpenAI(
            temperature=0.7,
            max_tokens=100,
            model=os.getenv('QDRANT_MODEL_NAME'),
        ),
        chain_type="stuff",
        retriever=vectorstore.as_retriever(
            search_type="similarity_score_threshold",
            search_kwargs={"score_threshold": 0.7, "k": 128},
            collection_name=os.getenv('QDRANT_COLLECTION_NAME'),
            get_chat_history=get_chat_history,
            search_index_kwargs={
                "hnsw_ef": 32,
                "hnsw_m": 16,
                "index_time_budget": 500,
                "stop_words": "結束 bye 幹".split(),
                "batch_size": 128,
                "chunk_size": 128,
                "chunk_overlap": 32,
            },
        ),
        memory=ConversationBufferMemory(memory_key="chat_history", input_key="question", return_messages=True, k=3),
        combine_docs_chain_kwargs={
            "prompt": ChatPromptTemplate.from_messages([
                system_message_prompt,
                human_message_prompt,
            ]),
        },
    )
    chat_history = []
    while True:
        qa.load_memory_variables({"chat_history": chat_history})  # this line raises the AttributeError below
        query = input("冒險者: ")  # "冒險者" = "Adventurer"
        result = qa({"question": query})
        chat_history.append(result)
        print(result["answer"])
        document = Document(page_content=query, metadata={'source': 'user'})
        vectorstore.add_documents([document])
        print(f'儲存的歷史紀錄:\n\n{chat_history}')  # "saved chat history"
        if query == "bye":
            break

if __name__ == "__main__":
    main()

Execution result:

Traceback (most recent call last):
  File "C:\Users\sys\Downloads\Qdrant\new.py", line 107, in <module>
    main()
  File "C:\Users\sys\Downloads\Qdrant\new.py", line 94, in main
    qa.load_memory_variables({"chat_history": chat_history})
    ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ConversationalRetrievalChain' object has no attribute 'load_memory_variables'

Please help me check whether I can read the chat history inside the loop.

Idea or request for content:

Achieve short-term and long-term memory at the same time (using the vector retrieval function as long-term memory).

dosubot[bot] commented 8 months ago

🤖

Hello @yen111445, great to see your question again. I hope all is well!

Based on the error message you provided, it seems like the ConversationalRetrievalChain object does not have a method called load_memory_variables. This method is not a part of the ConversationalRetrievalChain class in the LangChain framework.

The load_memory_variables method is defined on LangChain's memory classes (such as ConversationBufferMemory and ConversationSummaryBufferMemory) and returns the history buffer. You're trying to call this method on a ConversationalRetrievalChain object, which is causing the error.
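
For illustration, calling it directly on a memory instance works (a minimal standalone sketch):

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
memory.save_context({"question": "hi"}, {"answer": "hello"})
print(memory.load_memory_variables({}))
# {'chat_history': [HumanMessage(content='hi'), AIMessage(content='hello')]}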

If you want to load memory variables into the ConversationalRetrievalChain, you should do so when you're initializing the ConversationalRetrievalChain object. The ConversationalRetrievalChain class takes a memory argument in its constructor, which should be an instance of a class that implements the BaseMemory interface. In your case, you're using the ConversationBufferMemory class.

Here's how you can initialize the ConversationalRetrievalChain with a ConversationBufferMemory instance:

qa = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(
        temperature=0.7,
        max_tokens=100,
        model=os.getenv('QDRANT_MODEL_NAME'),
    ),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"score_threshold": 0.7, "k": 128},
        collection_name=os.getenv('QDRANT_COLLECTION_NAME'),
        get_chat_history=get_chat_history,
        search_index_kwargs={
            "hnsw_ef": 32,
            "hnsw_m": 16,
            "index_time_ budget": 500,
            "stop_words": "結束 bye 幹".split(),
            "batch_size": 128,
            "chunk_size:": 128,
            "chunk_overlap": 32,
        }
    ),
    memory=ConversationBufferMemory(memory_key="chat_history", input_key="question", return_messages=True, k=3),
    combine_docs_chain_kwargs={
        "prompt": ChatPromptTemplate.from_messages([
            system_message_prompt,
            human_message_prompt,
        ]),
    },
)

In the above code, the ConversationBufferMemory instance is passed to the ConversationalRetrievalChain constructor via the memory argument. This will allow the ConversationalRetrievalChain to use the ConversationBufferMemory for storing and retrieving conversation history.
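
With the memory wired in this way, the loop no longer needs to call load_memory_variables on the chain at all: the chain reads and writes the history itself on every call. A minimal sketch of the corrected loop (assuming the qa chain and vectorstore built above):

while True:
    query = input("冒險者: ")
    if query == "bye":
        break
    result = qa({"question": query})  # memory injects chat_history automatically
    print(result["answer"])
    # Keep persisting the user's turn for long-term (vector) retrieval, as before.
    vectorstore.add_documents([Document(page_content=query, metadata={'source': 'user'})])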

As for your question about achieving short-term memory and long-term memory at the same time, the LangChain framework does not currently support this feature out of the box. However, you can potentially implement this feature by extending the BaseMemory class and implementing your own logic for storing and retrieving short-term and long-term memory.
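
A rough sketch of that idea (assuming the legacy BaseMemory interface; the ShortAndLongTermMemory class, its field names, and its retrieval logic are illustrative assumptions, not a built-in LangChain API):

from typing import Any, Dict, List

from langchain.schema import BaseMemory

class ShortAndLongTermMemory(BaseMemory):
    """Illustrative sketch: last few turns verbatim (short-term) plus
    similarity search over all stored turns (long-term)."""

    vectorstore: Any                  # e.g. the Qdrant wrapper built above
    buffer: List[str] = []            # recent turns, newest last
    k: int = 3                        # how many recent turns to keep verbatim
    memory_key: str = "chat_history"

    @property
    def memory_variables(self) -> List[str]:
        return [self.memory_key]

    def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Long-term: retrieve stored turns similar to the current question.
        docs = self.vectorstore.similarity_search(inputs.get("question", ""), k=4)
        long_term = "\n".join(d.page_content for d in docs)
        # Short-term: the last k turns, verbatim.
        short_term = "\n".join(self.buffer[-self.k:])
        return {self.memory_key: f"{long_term}\n{short_term}".strip()}

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        turn = f"Human: {inputs['question']}\nAI: {outputs['answer']}"
        self.buffer.append(turn)            # short-term buffer
        self.vectorstore.add_texts([turn])  # persist for long-term retrieval

    def clear(self) -> None:
        self.buffer = []

If you pass an instance of this as the memory argument, note that the chain's get_chat_history must accept whatever type load_memory_variables returns (here a plain string rather than a list of messages).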

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jatinchawda1503 commented 8 months ago

@yen111445,

qa = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(
        temperature=0.7,
        max_tokens=100,
        model=os.getenv('QDRANT_MODEL_NAME'),
    ),

ConversationalRetrievalChain does not have a load_memory_variables method; instead, call it on the memory object:

memory = ConversationBufferMemory(memory_key="chat_history", input_key="question", return_messages=True, k=3)
memory.load_memory_variables({})
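
For example, keeping a reference to the memory object lets you both pass it to the chain and inspect the buffered history after each call (a sketch reusing the qa chain from above; note that the k argument is only honored by ConversationBufferWindowMemory, plain ConversationBufferMemory ignores it):

memory = ConversationBufferMemory(
    memory_key="chat_history", input_key="question", return_messages=True
)
# ... build qa exactly as above, but pass memory=memory ...
result = qa({"question": "你好"})
print(memory.load_memory_variables({}))  # inspect the stored chat_history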