Open da-bu opened 1 year ago
OK, in that case you need to persist the memory somewhere, either in a cache or in a database. Have a look at my project, where I save it in SQLite: https://github.com/talhaanwarch/doc_chat_api. LangChain also provides other options such as Redis, Postgres, etc.: https://python.langchain.com/en/latest/modules/memory/how_to_guides.html
I suggest setting verbose=True to understand the process better.
I am setting verbose=True and it's simply not behaving as it should: the chat history does not persist between chats for some reason, and saving it in SQLite, as you suggest, is not feasible for me. I wonder why it behaves this way.
You have to persist it yourself; by default it's not persistent.
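To make "persisting" concrete, here is a minimal sketch using only Python's standard-library sqlite3 module. It stores each turn in a table and reloads it as a list of (question, answer) tuples, the shape passed as chat_history in other comments in this thread. The table layout and the helper names open_store, save_turn and load_history are made up for illustration; this is not the code from the linked project.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database.

    ':memory:' is only for demonstration; pass a file path to survive restarts.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_turns ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "question TEXT NOT NULL, "
        "answer TEXT NOT NULL)"
    )
    return conn


def save_turn(conn: sqlite3.Connection, session_id: str, question: str, answer: str) -> None:
    """Append one question/answer pair for the given session."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO chat_turns (session_id, question, answer) VALUES (?, ?, ?)",
            (session_id, question, answer),
        )


def load_history(conn: sqlite3.Connection, session_id: str) -> List[Tuple[str, str]]:
    """Return the session's turns, oldest first, as (question, answer) tuples."""
    rows = conn.execute(
        "SELECT question, answer FROM chat_turns WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [(question, answer) for question, answer in rows]
```

The idea would be to call load_history before each chain call and save_turn after it, so the history no longer depends on any in-process object staying alive.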
I ended up combining your techniques with @esgdao's, and it now works for my use case! Off to refine the chain. This was wild to set up, by the way.
@st.cache_resource
def init_memory():
    return ConversationSummaryBufferMemory(
        llm=ChatOpenAI(temperature=0.1),
        output_key='answer',
        memory_key='chat_history',
        return_messages=True)

def retreive_best_answer(full_user_question: str):
    openai.api_key = os.getenv("OPEN_API_KEY")
    embeddings = OpenAIEmbeddings()
    llm = ChatOpenAI(temperature=0.1)
    vectordb = FAISS.load_local("merged_faiss_index", embeddings)
    prompt_template_doc = """
Use chat history : {chat_history} to determine the condition you are to research if not blank
Use the following pieces of context to answer the question at the end.
{context}
If you still cant find the answer, just say that you don't know, don't try to make up an answer.
You can also look into chat history.
{chat_history}
Question: {question}
Answer:
"""
    prompt_doc = PromptTemplate(
        template=prompt_template_doc,
        input_variables=["context", "question", "chat_history"],
    )
    qa = ConversationalRetrievalChain.from_llm(
        ChatOpenAI(temperature=0.1),
        vectordb.as_retriever(),
        memory=init_memory())
    results = qa({"question": full_user_question})
    return results["answer"], results["chat_history"]
Thanks to the commenters who've worked on this issue. I wanted to create a chatbot with these four things:
After many iterations and some help on this thread (shout out to @esgdao for the tip about setting the prompt) I have a chatbot that can do all these requirements. Here's my setup:
Note: I'm using langchain-0.0.200
# Imports
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import SystemMessagePromptTemplate
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
# Load the VectorDB
embeddings = OpenAIEmbeddings()
vectordb = Chroma(
collection_name='<<collection_name>>',
persist_directory=".chromadb/",
embedding_function=embeddings
)
# Create the multipurpose chain
qachat = ConversationalRetrievalChain.from_llm(
llm=ChatOpenAI(temperature=0),
retriever=vectordb.as_retriever(), # ☜ DOCSEARCH
return_source_documents=True # ☜ CITATIONS
)
# PROMPT 👇
sys_prompt = (
    "Act as a friendly and helpful customer support rep. "
    "Answer questions about <<company's>> products and services."
)
qachat.combine_docs_chain.llm_chain.prompt.messages[0] = SystemMessagePromptTemplate.from_template(sys_prompt)
# MEMORY 👇
chat_history = []
## Question 1
query = "Hi there, how are you?"
result = qachat({"question": query, "chat_history": chat_history})
print(result['answer'])
# "Hello! I'm an AI language model, so I don't have feelings, but I'm here to assist you with any questions you may have about <<company's>> products and services. How can I help you today?"
## Question 2, updating the chat_history object
chat_history = [(query, result["answer"])]
query = "Can you tell me where does X report pull from to get the Patient Confidential field?"
result = qachat({"question": query, "chat_history": chat_history})
print(result['answer'])
# "The X report pulls the Patient Confidential field from the patient\'s chart in <<Product>>. This field can be found under the "Patient Information" section of t...'"
## Question 3, testing the memory is working
chat_history = [(query, result["answer"])]
query = "Do I need a specific role to access that?" # ☜ Testing for memory
result = qachat({"question": query, "chat_history": chat_history})
print(result['answer'])
# "To access the Patient Confidential field in the patient\'s chart in <<Product>>, you need to have the "Patient Confidential" permission enabled in your user role. This permission is typically granted to users who..."
After multiple attempts and combinations this was the only setup that worked for me. Hopefully it gets worked out in future versions.
@cheevahagadog thanks so much for this.
I am using RetrievalQA (from langchain.chains import RetrievalQA), and this is how I adapted your code to work with it:
chain.combine_documents_chain.llm_chain.prompt.messages[0] = SystemMessagePromptTemplate.from_template(sys_prompt)
Now I am wondering: is there not a more "elegant" way to declare an initial prompt?
Doing it this way feels like a hack.
Thanks again
@robertocommit There is a way that's not a hack, actually. If you use ConversationalRetrievalChain.from_llm, you can provide this as a parameter:
combine_docs_chain_kwargs={'prompt': prompt}
Not sure if this works with RetrievalQA, though.
@robertocommit It looks like BaseRetrievalQA.from_llm() (and hence RetrievalQA) accepts a prompt argument, so you should be able to supply a custom prompt there.
This worked for me. You can bypass the history-formatting function with get_chat_history=lambda h: h, which just returns the string from memory unchanged. For example, in ConversationalRetrievalChain:
chat = ConversationalRetrievalChain.from_llm(
    memory=ConversationBufferMemory(memory_key="chat_history"),
    get_chat_history=lambda h: h,
    ...
)
However, if you want a list of messages from memory to pass through get_chat_history, add return_messages=True to ConversationBufferMemory, like this:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
It then passes the correct format to get_chat_history.
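As a rough illustration of what a custom get_chat_history callable could look like, the sketch below accepts either an already-formatted string (what memory returns without return_messages=True) or a list of (human, ai) tuples (what several comments in this thread pass in manually). The function name format_history and the "Human"/"Assistant" labels are assumptions for this example; the library's own default formatter may label and join turns differently.

```python
from typing import List, Tuple, Union

# Either a pre-formatted string or a list of (human, ai) turns.
History = Union[str, List[Tuple[str, str]]]


def format_history(history: History) -> str:
    """Turn the chat history into a single string for the prompt."""
    if isinstance(history, str):
        # Already a string: pass it through, like `lambda h: h`.
        return history
    lines = []
    for human, ai in history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    return "\n".join(lines)
```

A function like this could be passed as get_chat_history=format_history in place of the lambda, which makes the expected input format explicit in one place.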
I would like to contribute a fix for this. I can't find an option to assign this issue to myself; can someone help?
Dear friends,
I have been having problems for several days and would greatly appreciate some help.
First of all, my Python version is 3.11.4, and my LangChain version is 0.0.224.
I am using ConversationalRetrievalChain with ConversationBufferWindowMemory (or ConversationBufferMemory), but I have tried several combinations, and:
One thing I noticed is that when I tried to use a suggestion from this forum, which was to use qa_prompt=QA_PROMPT, it gave an error saying that this parameter does not exist.
The CONDENSE PROMPT asks it to rephrase the question, and it seems to be the cause of it repeating my reformulated question.
Has anyone here been able to make this work recently?
Is there an alternative to ConversationalRetrievalChain without the condense_prompt?
Thank you very much!
chain({ "question": "What was the last question I asked you.", "chat_history": history }, return_only_outputs=True)
where did you get that 'history' variable?
edit: fixed now
def qna(question: str, vector_name: str, chat_history=[]):
    logging.debug("Calling qna")
    llm, embeddings, llm_chat = pick_llm(vector_name)
    vectorstore = pick_vectorstore(vector_name, embeddings=embeddings)
    retriever = vectorstore.as_retriever(search_kwargs=dict(k=3))
    prompt = pick_prompt(vector_name)
    logging.basicConfig(level=logging.DEBUG)
    logging.debug(f"Chat history: {chat_history}")
    qa = ConversationalRetrievalChain.from_llm(ChatOpenAI(model="gpt-4", temperature=0.2, max_tokens=5000),
                                               retriever=retriever,
                                               return_source_documents=True,
                                               verbose=True,
                                               output_key='answer',
                                               combine_docs_chain_kwargs={'prompt': prompt},
                                               condense_question_llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0))
    try:
        result = qa({"question": question, "chat_history": chat_history})
    except Exception as err:
        error_message = traceback.format_exc()
        result = {"answer": f"An error occurred while asking: {question}: {str(err)} - {error_message}"}
    logging.basicConfig(level=logging.INFO)
    return result
How should we handle the case where the current question is completely new and has no relation to the previous chat history? The standalone question would probably be nonsense, as its meaning is twisted by the chat history.
# .env
OPENAI_API_KEY=xxxxxx
OPENAI_API_BASE=https://xxxxxxxx.openai.azure.com/
OPENAI_API_VERSION=2023-05-15

import os
import openai
from dotenv import load_dotenv
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings

# Load environment variables (set OPENAI_API_KEY, OPENAI_API_BASE, and OPENAI_API_VERSION in .env)
load_dotenv()

# Configure OpenAI API
openai.api_type = "azure"
openai.api_base = os.getenv('OPENAI_API_BASE')
openai.api_key = os.getenv("OPENAI_API_KEY")
openai.api_version = os.getenv('OPENAI_API_VERSION')

# Initialize gpt-35-turbo and our embedding model
llm = AzureChatOpenAI(deployment_name="gpt-35-turbo")
embeddings = OpenAIEmbeddings(deployment_id="text-embedding-ada-002", chunk_size=1)

from langchain.document_loaders import DirectoryLoader
from langchain.document_loaders import TextLoader
from langchain.text_splitter import TokenTextSplitter

loader = DirectoryLoader('data/qna/', glob="*.txt", loader_cls=TextLoader, loader_kwargs={'autodetect_encoding': True})
documents = loader.load()
text_splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

from langchain.vectorstores import FAISS
db = FAISS.from_documents(documents=docs, embedding=embeddings)

from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

# Adapt if needed
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template("""Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:""")

qa = ConversationalRetrievalChain.from_llm(llm=llm,
                                           retriever=db.as_retriever(),
                                           condense_question_prompt=CONDENSE_QUESTION_PROMPT,
                                           return_source_documents=True,
                                           verbose=False)
Hi everyone, I have figured out a workaround for sending in the entire concatenated memory into a ConversationalRetrievalChain and bypassing the question condensing chain. This workaround builds on this answer.
First, create a no-op LLM chain that we will use as the question generator. This will directly pass the question to the combine_docs chain, bypassing the question condensation step:
from langchain import LLMChain, PromptTemplate
from langchain.chat_models import ChatOpenAI

class NoOpLLMChain(LLMChain):
    """No-op LLM chain."""

    def __init__(self):
        """Initialize."""
        super().__init__(llm=ChatOpenAI(), prompt=PromptTemplate(template="", input_variables=[]))

    def run(self, question: str, *args, **kwargs) -> str:
        return question
Instantiate a memory object with output_key='answer':
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key='chat_history', output_key='answer', return_messages=True)
Instantiate a convRQA chain using the from_llm method, and replace the default question_generator with our no-op chain:
from langchain.chains import ConversationalRetrievalChain

conv_rqa = ConversationalRetrievalChain.from_llm(llm=llm,
                                                 chain_type="stuff",
                                                 verbose=True,  # boolean, not the string "True"
                                                 memory=memory,
                                                 retriever=retriever,
                                                 return_source_documents=True)
no_op_chain = NoOpLLMChain()
conv_rqa.question_generator = no_op_chain
Now we will modify the default combine_docs_chain system message prompt to include the chat history at the end. You can also modify this prompt to tailor it to your use case. We also need to add 'chat_history' as a variable to the ChatPromptTemplate object in combine_docs_chain.llm_chain.prompt:
from langchain.prompts.chat import SystemMessagePromptTemplate
modified_template = "Use the following pieces of context to answer the users question. \nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n{context}\nChat History:\n{chat_history}"
system_message_prompt = SystemMessagePromptTemplate.from_template(modified_template)
conv_rqa.combine_docs_chain.llm_chain.prompt.messages[0] = system_message_prompt
# add chat_history as a variable to the llm_chain's ChatPromptTemplate object
conv_rqa.combine_docs_chain.llm_chain.prompt.input_variables = ['context', 'question', 'chat_history']
This is working successfully for me. Hope this helps!
@hemanthkrishna1298 Thanks a lot, this works for me as well!
I'm running into the next challenge now: Since my index is pretty big and has a lot of different documents, when I ask a generic follow-up question like "Give more details on the previous answer", my (Pinecone) retriever retrieves a bunch of documents completely unrelated to the previous answers. It seems like the retriever is only taking into account the latest question, but not the chat history. (Still hallucinates more details, so answer sounds not too bad, but ideally should of course take a closer look at the documents from the previous answer(s).)
How are you dealing with that?
@clauslang I haven't gotten that far with testing yet, but I'm sure I'll bump into the same thing. Maybe it helps to see what the history property contains, which to my knowledge is used as context to answer the user's question. If the history only contains the latest answer, then that's the problem. So I would first verify your assumption, i.e. "the retriever is only taking into account the latest question (answer?)", and then go from there.
So it turns out the history property does contain the complete history, but that doesn't matter because ConversationalRetrievalChain simply doesn't pass the history (hidden in the inputs parameter) to the retriever when asking for the relevant documents:
class ConversationalRetrievalChain(BaseConversationalRetrievalChain):
    # ...
    def _get_docs(self, question: str, inputs: Dict[str, Any]) -> List[Document]:
        docs = self.retriever.get_relevant_documents(question)
        return self._reduce_tokens_below_limit(docs)
    # ...
(Might be from a previous langchain version, latest one here.)
My first hacky attempt at also taking into account the history seems to work ok:
class MyConversationalRetrievalChain(ConversationalRetrievalChain):
    def _get_docs(self, question: str, inputs: Dict[str, Any]) -> List[Document]:
        history = '\n\n'.join(['{}: "{}"'.format(message.type, message.content) for message in inputs['chat_history']])
        question_with_history = 'question: "{}"\n\nchat history:\n\n{}'.format(question, history)
        docs = self.retriever.get_relevant_documents(question_with_history)
        return self._reduce_tokens_below_limit(docs)
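Since the query built this way grows with every turn, one possible refinement is to include only the most recent messages. The sketch below isolates the string-building step as a plain function so it can be tried without a retriever. The Message dataclass is a stand-in for the library's message objects (it only assumes they expose type and content, as the override above does), and build_retrieval_query and max_messages are hypothetical names, not part of LangChain.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Message:
    """Stand-in for a chat message exposing .type and .content."""
    type: str
    content: str


def build_retrieval_query(question: str, messages: Sequence[Message], max_messages: int = 4) -> str:
    """Combine the question with the last `max_messages` history messages."""
    # Guard against max_messages <= 0, because seq[-0:] would return everything.
    recent = list(messages)[-max_messages:] if max_messages > 0 else []
    if not recent:
        return question
    history = "\n\n".join(f'{m.type}: "{m.content}"' for m in recent)
    return f'question: "{question}"\n\nchat history:\n\n{history}'
```

Inside an override like the one above, the result would be passed to the retriever instead of the bare question. Whether a truncated history actually retrieves better documents depends on the index and would need testing.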
So I think the issue here is that the BaseChatMemory gets all confused when the output it receives contains more than one key, and it doesn't know which one to assign as the answer; it's in this code here:
if self.output_key is None:
    if len(outputs) != 1:
        raise ValueError(f"One output key expected, got {outputs.keys()}")
    output_key = list(outputs.keys())[0]
else:
When you have return_source_documents=True, the output has two keys, answer and source_documents, and that causes this to throw an error.
The workaround that got this working for me was to specify answer as the output key when creating this ConversationBufferMemory object. Then it doesn't have to try to guess at what the output_key is.
memory = ConversationBufferMemory(
    memory_key='chat_history',
    return_messages=True,
    output_key='answer')
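The behaviour described above can be reproduced with a small standalone function. This is a simplified stand-in that mirrors the quoted snippet, not the library's actual implementation; the name pick_output is made up for this example.

```python
from typing import Any, Dict, Optional


def pick_output(outputs: Dict[str, Any], output_key: Optional[str] = None) -> Any:
    """Select the value that memory should store as the AI's reply."""
    if output_key is None:
        # Without an explicit key, the choice is only unambiguous for one output.
        if len(outputs) != 1:
            raise ValueError(f"One output key expected, got {outputs.keys()}")
        output_key = next(iter(outputs))
    return outputs[output_key]


# A chain with return_source_documents=True produces two keys:
outputs = {"answer": "42", "source_documents": ["doc-1"]}
print(pick_output(outputs, output_key="answer"))
try:
    pick_output(outputs)
except ValueError as err:
    print(f"ValueError: {err}")
```

With a single-key output the function needs no hint, which is why the problem only appears once return_source_documents=True adds a second key.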
This is gold!
Not sure if this is still relevant, but I found that the best way to obtain chat history with the ConversationalRetrieval/RetrievalQA chain was to use it as a tool with an agent (similar to this: https://python.langchain.com/docs/modules/agents/how_to/agent_vectorstore) or to directly use the Conversational Retrieval agent (https://python.langchain.com/docs/use_cases/question_answering/how_to/conversational_retrieval_agents). This solved all my issues with memory/chat history, especially when integrating with streamlit.
Thanks, I'll check it out.
The main reason for these types of issues is the inconsistency of output keys in Langchain chains:
LLMChain -> 'text'
RetrievalQA -> {'question', 'result', 'source_documents'}
ConversationalRetrievalChain -> {'question', 'answer', 'source_documents'}
If you are using memory with each chain type:
- if the chain output has only one key, memory will get the output by default.
- if there is more than one output key, use the relevant output key for the chain, for example in ConversationalRetrievalChain:
memory = ConversationBufferMemory(
    memory_key='chat_history', return_messages=True, output_key='answer'
)
If you are using Langchain agents, the output key is 'output' and you will need to modify one of:
- OutputParser
- the _get_input_output function in class BaseChatMemory
For example, ConversationalRetrievalChain with ZeroShotAgent:
def _get_input_output(
    self, inputs: Dict[str, Any], outputs: Dict[str, str]
) -> Tuple[str, str]:
    if self.input_key is None:
        prompt_input_key = get_prompt_input_key(inputs, self.memory_variables)
    else:
        prompt_input_key = self.input_key
    if self.output_key is None:
        """
        output for agent with LLM chain tool = {answer}
        output for agent with ConversationalRetrievalChain tool = {'question', 'chat_history', 'answer','source_documents'}
        """
        LLM_key = 'output'
        Retrieval_key = 'answer'
        if isinstance(outputs[LLM_key], dict):
            Retrieval_dict = outputs[LLM_key]
            if Retrieval_key in Retrieval_dict.keys():
                # output keys are 'answer', 'source_documents'
                output = Retrieval_dict[Retrieval_key]
            else:
                raise ValueError(f"output key: {LLM_key} not a valid dictionary")
        else:
            # otherwise output key will be 'output'
            output_key = list(outputs.keys())[0]
            output = outputs[output_key]
        # if len(outputs) != 1:
        #     raise ValueError(f"One output key expected, got {outputs.keys()}")
    else:
        output_key = self.output_key
        output = outputs[output_key]
    return inputs[prompt_input_key], output
Not sure why, but I had to set the output_key = 'answer' on the ConversationalRetrieval object as well. It wouldn't work for me just on the memory object.
How do I maintain user-specific chat_history/memory? I built a Streamlit app served from a Cloud Run service, and with multiple users asking questions it seems to use the combined chat_history of all users when responding. How should this be handled?
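One likely cause (an assumption, since the app code isn't shown) is a single memory or history object shared by every session, for example one created at module level or inside @st.cache_resource, which Streamlit shares across all users. A minimal sketch of the alternative is to key the history by a session or user identifier. The class name SessionHistories and the max_turns cap are hypothetical; in Streamlit the same effect could be had by keeping the list in st.session_state, which is per-session.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Turn = Tuple[str, str]  # (question, answer)


class SessionHistories:
    """Keeps a separate, bounded chat history per session id."""

    def __init__(self, max_turns: int = 20) -> None:
        if max_turns < 1:
            raise ValueError("max_turns must be at least 1")
        self._max_turns = max_turns
        self._turns: DefaultDict[str, List[Turn]] = defaultdict(list)

    def get(self, session_id: str) -> List[Turn]:
        """Return a copy of the session's history (empty if unknown)."""
        return list(self._turns.get(session_id, []))

    def append(self, session_id: str, question: str, answer: str) -> None:
        """Record one turn and drop the oldest turns beyond the cap."""
        turns = self._turns[session_id]
        turns.append((question, answer))
        del turns[:-self._max_turns]
```

Each request would then pass store.get(session_id) as chat_history and call store.append(...) with the result. Note that an in-process store like this is not shared between multiple container instances, so a multi-instance deployment would still need an external store.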
@nickmuchi87 please see @ToddKerpelman 's answer, add the output_key='answer' in the ConversationBufferMemory. This worked for me.
memory = ConversationBufferMemory( memory_key='chat_history', return_messages=True, output_key='answer')
yea it works with output_key='answer'
I hope this below chain can solve your issue,
qa = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model_name=llm_name, temperature=0),
    chain_type=chain_type,
    retriever=retriever,
    return_source_documents=True,
    return_generated_question=True,
)
Hi there,
In my chat application, I've been able to return either the "source_documents" using the ConversationBufferMemory, or the "answer", but not both.
memory = ConversationBufferMemory(
    memory_key='chat_history', return_messages=True, output_key='source_documents')  # works with either "answer" or "source_documents", but not both
# set up generic retriever
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 8})
conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)
Any suggestions would be great. Is it possible to initialise two separate ConversationBufferMemories and then somehow manually append "source_documents" to "answer" if possible? Seems hack-y, but interested in trying...
Thank you so much, it solved my problem.
Hello all, I am planning to publish my code as a REST API so that consumers can pass question and chat_history as input parameters. I went through the whole thread and tried multiple suggestions, but nothing worked for setting the initial chat history.
Please let me know if someone has tried publishing ConversationalRetrievalChain as a REST API. Here is my code snippet.
condense_question_prompt = PromptTemplate.from_template("""
Use the following pieces of context and chat history to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Chat history: {chat_history}
Question: {question}
inputVariables: [ "question", "chat_history"]
""")
conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model_name='gpt-3.5-turbo-1106', temperature=0.5),
    retriever=vector_store.as_retriever(),
    get_chat_history=lambda h: h,
    memory=ConversationBufferMemory(memory_key='chat_history', return_messages=True, output_key='answer'),
    chain_type="stuff",
    condense_question_prompt=condense_question_prompt,
)
chat_history = []
chat_history.append(('What is the date Australia was founded', 'Australia was founded in 1901'))
question = "What was my last question"
answer = conversation_chain({"question": question, "chat_history": chat_history}, return_only_outputs=True)
print("answer::", answer)
answer:: {'answer': "I don't know, could you please specify your question?"}
output_key='answer'
@esgdao Can you please share the source code here? The thread is getting very messy.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True, output_key='answer')
question_generator = LLMChain(llm=llm, prompt=prompt_1)
doc_chain = load_qa_with_sources_chain(llm, chain_type="map_reduce")
chain = ConversationalRetrievalChain(
    retriever=me,
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
    return_source_documents=True,
    memory=memory,
    verbose=False,
    rephrase_question=True,
    return_generated_question=True
)
I'm using the map_reduce chain type because it provides pertinent answers along with the source document name. However, I've noticed that answer generation is quite slow, typically taking 20 to 30 seconds. Could anyone offer suggestions on how to speed this up?
def retreival_qa_chain(query, COLLECTION_NAME, prompt):
    embedding = OpenAIEmbeddings()
    llm = ChatOpenAI(temperature=0.1, model_name="gpt-3.5-turbo-16k", openai_api_key=env('OPENAI_API_KEY'))
    memory = ConversationBufferMemory(llm=llm, output_key='answer', memory_key='chat_history', return_messages=True)
    print(memory.load_memory_variables({}), "--++++++++++-------------------------------------------")
    vector_store = PGVector(
        connection_string=CONNECTION_STRING,
        collection_name=COLLECTION_NAME,
        embedding_function=embedding
    )
    retriever = vector_store.as_retriever(search_kwargs={"k": 3})
    chain = ConversationalRetrievalChain.from_llm(llm=llm, memory=memory, chain_type="stuff",
                                                  combine_docs_chain_kwargs={'prompt': prompt},
                                                  retriever=retriever,
                                                  return_source_documents=True,
                                                  get_chat_history=lambda h: h,
                                                  verbose=True)
    return chain
I've been struggling with an issue for the past day, and I would be incredibly grateful if you could assist me. When I run memory.load_memory_variables({}), I get an empty list: the memory does not store the previous conversation and the output is {'chat_history': []}. Could you please help me out? Thank you! @My3VM, @BidishaAdhikari, @pranavi-shekhar @hemanthkrishna1298
langchain.chains.conversational_retrieval is where ConversationalRetrievalChain lives in the Langchain source code. In that same location is a module called prompts.py which contains both CONDENSE_QUESTION_PROMPT and QA_PROMPT. But there's no mention of qa_prompt in ConversationalRetrievalChain, or its base chain BaseConversationalRetrievalChain, or even its base chain, Base.
That's why I was getting the Pydantic error, qa_prompt extra fields not permitted (type=value_error.extra). qa_prompt is not part of ConversationalRetrievalChain.
A workaround is to insert your custom PromptTemplate into the chain after it's been defined. You have to go very deep into the chain, though. For this example I've defined my prompt as prompt and my chain as chain.
First, import SystemMessagePromptTemplate. Set up your chain as usual, then execute the line below the import:
from langchain.prompts.chat import SystemMessagePromptTemplate
chain.combine_docs_chain.llm_chain.prompt.messages[0] = SystemMessagePromptTemplate(prompt=prompt)
This workaround works for me. Hopefully this will be made easier by the Langchain team in future. If not, I'll just leave it in my code. But in the meantime I now have a fully working ConversationalRetrievalChain with ConversationSummaryBufferMemory and a custom prompt.
can you provide the full reference..
@sivakumar41 Hi, I am working on a chatbot where a user can create multiple projects, each containing the same or different files. The problem is switching from one project's qa object to another's. Each time I ask a question, a new chat object is created from ConversationalRetrievalChain, which overwrites the previous memory and starts fresh. Can you please explain how I can overcome this? I'm really stuck on it.
I'm using django-rest in a project with langchain. I use my Chroma DB and have a chat over my data, but the problem is that my chat_history is not kept; it gets deleted. I've searched a lot and haven't been able to solve it. Can anyone tell me what I'm doing wrong?
def handle_chat(self, request):
    chat_history = []
    if os.path.exists(self.persist_directory):
        custom_template = """Given the following conversation and a follow-up message,\
Rephrase the follow-up message into a separate question or instruction that \
represents the user's intent, add all the necessary context if necessary to generate a complete file and\
Unambiguous questions or instructions, only based on the story, do not invent messages, take all the questions and process them as one, if applicable, give a single answer if this warrants it.\
Keep the same language as the follow-up input message.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question or instruction:"""
        memory = ConversationBufferMemory(
            memory_key='chat_history', return_messages=True, output_key='answer')
        vectorstore = Chroma(
            persist_directory=self.persist_directory, embedding_function=OpenAIEmbeddings())
        chain = ConversationalRetrievalChain.from_llm(
            llm=ChatOpenAI(model=self.model_name),
            retriever=vectorstore.as_retriever(
                search_kwargs={"k": 1}),
            memory=memory,
            condense_question_prompt=PromptTemplate.from_template(
                custom_template),
        )
    else:
        print("No existing index found. Please create one before using this view.")
        vectorstore = None
        chain = None
    if not chain:
        return Response({"error": "Initialization error or index not found."},
                        status=status.HTTP_500_INTERNAL_SERVER_ERROR)
    messages = request.data.get('messages', [])
    if len(messages) > 1:
        aggregated_body = '\n'.join(f"{index + 1}- {msg.get('body')}" for index, msg in enumerate(messages) if msg.get('body'))
        response_body = aggregated_body
    elif len(messages) == 1 and messages[0].get("body"):
        response_body = messages[0].get("body")
    else:
        return Response({"error": "No valid messages found."}, status=status.HTTP_400_BAD_REQUEST)
    if messages:
        response = chain.invoke({"question": response_body, "chat_history": chat_history})
        answer = response.get("answer", "No se encontró respuesta.")
        chat_history.append((response_body, answer))
        print(chat_history)
    else:
        answer = "Lo siento, no tengo una respuesta concreta para tu pregunta."
    return Response({"responses": [{"response": answer}]}, status=status.HTTP_200_OK)
@fckfck97
You're creating the memory object inside the function, so a new, empty memory is built on every call and the chain cannot remember your chat history.
memory = ConversationBufferMemory(
    memory_key='chat_history', return_messages=True, output_key='answer')
Keep this code outside of the function.
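One way to follow this advice in a view that serves several users is to keep the history in a module-level store keyed by session, so it outlives each request. The sketch below is only an illustration, not the poster's code: `SESSION_HISTORIES`, `get_history`, and `record_turn` are hypothetical names, and the store lives in process memory, so it is lost on restart and is not shared between worker processes.

```python
from collections import defaultdict

# Hypothetical module-level store: created once at import time,
# so it survives across calls to the request handler.
SESSION_HISTORIES = defaultdict(list)


def get_history(session_id):
    """Return the list of (question, answer) tuples for one session."""
    return SESSION_HISTORIES[session_id]


def record_turn(session_id, question, answer):
    """Append one completed exchange to the session's history."""
    SESSION_HISTORIES[session_id].append((question, answer))


# Simulate two separate requests from the same session.
record_turn("abc", "What is Chroma?", "A vector store.")
record_turn("abc", "Does it persist?", "Yes, with persist_directory.")
print(get_history("abc"))
```

A list of (question, answer) tuples like this is a shape that `ConversationalRetrievalChain` accepts for `chat_history` when no `memory` object is attached to the chain.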
Why is chat_history not working in my case? Here is my code; can someone give me a hand? PS: I'm storing the chat history in a MongoDB database and then retrieving it from there, but it doesn't seem to be integrated with the chain.
def make_chain(session_id, chat_history):
    QA_PROMPT_DOCUMENT_CHAT = """......"
Chat history: {chat_history}
Context: {context}
Question: {question}
input_variables : ["context","question","chat_history"]
Helpful Answer:"""
    custom_prompt = PromptTemplate(
        input_variables=["context", "question", "chat_history"],
        template=QA_PROMPT_DOCUMENT_CHAT,
    )
    model = ChatOpenAI(model_name="gpt-3.5-turbo", temperature="0.7")
    global memory
    memory = ConversationBufferMemory(
        chat_memory=ChatMessageHistory(messages=[]),
        memory_key="chat_history",
        return_messages=True,
        input_key='question',
        output_key='answer',
        chat_history_key='chat_history',
    )
    embeddings = OpenAIEmbeddings()
    vector_store = Chroma(
        collection_name="data",
        embedding_function=embeddings,
        persist_directory="chroma",
    )
    chain = ConversationalRetrievalChain.from_llm(
        llm=model,
        retriever=vector_store.as_retriever(),
        return_source_documents=False,
        combine_docs_chain_kwargs={"prompt": custom_prompt},
        condense_question_prompt=custom_prompt,
        memory=memory,
        get_chat_history=lambda h: h,
        output_key='answer',
    )
    system_message_prompt = SystemMessagePromptTemplate.from_template(QA_PROMPT_DOCUMENT_CHAT)
    return chain

def gen_answer(user_input, session_id, chat_history_input=None):
    chat_history_collection = db[f"session{session_id}"]
    chat_history_docs = chat_history_collection.find()
    chat_history = []
    for doc in chat_history_docs:
        role = doc["role"]
        content = doc["content"]
        if role == "human":
            chat_history.append(HumanMessage(content=content))
        elif role == "ai":
            chat_history.append(AIMessage(content=content))
    chain = make_chain(session_id=session_id, chat_history=chat_history)
    question = user_input
    response = chain({"question": question, "chat_history": chat_history})
    chat_history.append(HumanMessage(content=question))
    print("Fetched chat history:", chat_history)
    print("Response:", response)
    answer = response["answer"]
    return answer
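A possible cause, offered as an assumption rather than a confirmed diagnosis: when a `memory` object is attached, the chain takes `chat_history` from that memory, and `make_chain` builds a new, empty one on every call, so the history fetched from MongoDB may never reach the prompt. One alternative is to drop `memory` and pass the stored history explicitly. The helper below, `docs_to_pairs` (a hypothetical name), turns stored `role`/`content` documents like those in the code above into (human, ai) tuples; the sample documents are made up for illustration.

```python
def docs_to_pairs(docs):
    """Pair each human message with the AI reply that follows it."""
    pairs = []
    pending_question = None
    for doc in docs:
        if doc["role"] == "human":
            pending_question = doc["content"]
        elif doc["role"] == "ai" and pending_question is not None:
            pairs.append((pending_question, doc["content"]))
            pending_question = None
    return pairs


# Made-up documents in the same shape as the MongoDB records above.
stored = [
    {"role": "human", "content": "Hi"},
    {"role": "ai", "content": "Hello! How can I help?"},
    {"role": "human", "content": "Summarize the report."},
]
history = docs_to_pairs(stored)
print(history)
```

A trailing human message with no reply yet is left out of the pairs, since it is the question about to be asked rather than part of the history.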
Hi,
I'm following the Chat index examples and was surprised that the history is not a Memory object but just an array. However, it is possible to pass a memory object to the constructor, if
- I also set memory_key to 'chat_history' (default key names are different between ConversationBufferMemory and ConversationalRetrievalChain)
- I also adjust get_chat_history to pass through the history from the memory, i.e. lambda h : h.
This is what that looks like:
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=False)
conv_qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    get_chat_history=lambda h: h)
Now, my issue is that if I also want to return sources that doesn't work with the memory - i.e. this does not work:
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=False)
conv_qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    get_chat_history=lambda h: h,
    return_source_documents=True)
The error message is "ValueError: One output key expected, got dict_keys(['answer', 'source_documents'])".
Maybe I'm doing something wrong? If not, this seems worth fixing to me. Or, more generally, could memory and the ConversationalRetrievalChain be made more directly compatible?
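The error appears to arise because the memory has to pick a single output value to store, and with `return_source_documents=True` the chain returns two keys. Setting `output_key='answer'` on the memory, as other snippets in this thread do, tells it which one to save. The sketch below is a simplified stand-in for that selection rule, not LangChain's actual source; `pick_output` is a hypothetical name.

```python
def pick_output(outputs, output_key=None):
    """Choose the value a memory would store from a chain's outputs."""
    if output_key is None:
        # Without an explicit key, the choice is only unambiguous
        # when there is exactly one output.
        if len(outputs) != 1:
            raise ValueError(f"One output key expected, got {outputs.keys()}")
        output_key = next(iter(outputs))
    return outputs[output_key]


outputs = {"answer": "42", "source_documents": ["doc1"]}

# No output_key: two candidate keys, so the choice is ambiguous.
try:
    pick_output(outputs)
except ValueError as err:
    print(err)

# Explicit output_key: the answer is selected and the sources are ignored.
print(pick_output(outputs, output_key="answer"))
```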
chat_history = ['user_query', 'ai_response']
if chat_history:
    memory = ConversationBufferWindowMemory(memory_key="chat_history", output_key="answer", return_messages=True)
    memory.chat_memory.add_user_message(chat_history[0])
    memory.chat_memory.add_ai_message(chat_history[1])
else:
    memory = ConversationBufferWindowMemory(k=0, memory_key="chat_history", output_key="answer", return_messages=True)
print(memory)
chain = ConversationalRetrievalChain.from_llm(
    llm=model,
    retriever=retriever,
    memory=memory,
    get_chat_history=lambda h: h,
    return_source_documents=True,
    combine_docs_chain_kwargs={'prompt': prompt_template},
    verbose=verbose,
)
response = chain.invoke({"question": user_question, "chat_history": chat_history})