langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com

Custom prompt to RetrievalQAWithSourcesChain ? #3523

Closed VladoPortos closed 11 months ago

VladoPortos commented 1 year ago

How can I add a custom prompt to:

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chains.qa_with_sources.loading import load_qa_with_sources_chain

qa_chain = load_qa_with_sources_chain(llm, chain_type="stuff")
qa = RetrievalQAWithSourcesChain(combine_documents_chain=qa_chain, retriever=docsearch.as_retriever())

There is no prompt= parameter for this class...

SimonB97 commented 1 year ago

+1 Encountering the same issue here

jphme commented 1 year ago

@VladoPortos can you specify?

For me it works without any problems; there are two prompts ("prompt" for the few-shot prompt, "document_prompt" for the inserted documents). E.g. one example I use (which works flawlessly):


from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chains.qa_with_sources.loading import load_qa_with_sources_chain

GERMAN_QA_PROMPT = PromptTemplate(template=german_few_shot_doc_prompt, input_variables=["summaries", "question"])
GERMAN_DOC_PROMPT = PromptTemplate(
    template="Inhalt: {page_content}\nQuelle: {source}",
    input_variables=["page_content", "source"])

qa_chain = load_qa_with_sources_chain(llm, chain_type="stuff",
                                      prompt=GERMAN_QA_PROMPT,
                                      document_prompt=GERMAN_DOC_PROMPT)
chain = RetrievalQAWithSourcesChain(combine_documents_chain=qa_chain, retriever=retriever,
                                    reduce_k_below_max_tokens=True, max_tokens_limit=3375,
                                    return_source_documents=True)

(Define german_few_shot_doc_prompt as you wish; see the original prompts here: https://github.com/hwchase17/langchain/blob/85dae78548ed0c11db06e9154c7eb4236a1ee246/langchain/chains/qa_with_sources/stuff_prompt.py#L4.)
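
(For illustration, a minimal German template along those lines; the exact wording is up to you, but it must consume the {summaries} and {question} variables:)

german_few_shot_doc_prompt = """Beantworte die Frage am Ende anhand der folgenden Quellen.
Gib am Ende deiner Antwort die verwendeten Quellen unter "SOURCES" an.

QUELLEN:
{summaries}

FRAGE: {question}
ANTWORT:"""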

You probably missed the constructors because they are somewhat hidden in the loading.py file here: https://github.com/hwchase17/langchain/blob/85dae78548ed0c11db06e9154c7eb4236a1ee246/langchain/chains/qa_with_sources/loading.py#L48

blazickjp commented 1 year ago

Is there a way to add memory to this?

VladoPortos commented 1 year ago

Thanks @jpdus, that seems to be the way. Thanks again! :)

Also, same question as @blazickjp: is there a way to add chat memory to this?

Currently I was doing it in two steps: getting the answer from this chain, then a chat chain with the answer and a custom prompt + memory to provide the final reply. But that's two calls to the API :-/
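
(For reference, a rough sketch of that two-step workaround; user_question and llm are illustrative placeholders, and qa is the chain from the question above:)

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# step 1: get a grounded answer from the QA chain
result = qa({"question": user_question})

# step 2: a separate conversation chain with memory and a persona prompt
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())
final_reply = conversation.run(
    f"Rephrase this answer in the assistant's voice: {result['answer']}"
)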

jphme commented 1 year ago

EDIT: My original tool definition doesn't work anymore as of 0.0.162, code updated

Also, same question as @blazickjp: is there a way to add chat memory to this?

Currently I was doing it in two steps: getting the answer from this chain, then a chat chain with the answer and a custom prompt + memory to provide the final reply. But that's two calls to the API :-/

Well, this is not that easy, because the DocumentQA/DocumentCombine chains have customized prompts that imho don't work great for a conversation chain.

In my opinion the correct solution would be to use a Chat Conversation Agent that utilizes the RetrievalQAWithSourcesChain as a tool.

Just tried it, and it works as expected:

from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

tools = [
    Tool.from_function(
        name="Search",
        func=chain.__call__,  # passing the chain itself no longer works because of an inspection issue, and there is no call function
        coroutine=chain.acall,  # if you want to use async
        description="useful to answer factual questions"
    ),
]
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
llm = ChatOpenAI(temperature=0)
agent_chain = initialize_agent(tools, llm, agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION, verbose=True, memory=memory)
agent_chain.run(input="Some Question")
...
agent_chain.run(input="Some follow-up Question")

where chain is the RetrievalQAWithSourcesChain as defined above.

Hope that helps!

blazickjp commented 1 year ago

@jpdus Thank you! I'm still very confused about the design aspect of this: why would an agent be needed for something that feels very much like a chain?

jphme commented 1 year ago

@jpdus Thank you! I'm still very confused about the design aspect of this: why would an agent be needed for something that feels very much like a chain?

Well, it is confusing at first, but makes sense on second thought: you basically have two chains (one for the conversation and one for the document QA) and the desired output is not strictly sequential (e.g. the model has to have access to both the conversation history and the document information, but it's not always required in the same order).

The agent can then decide when and how to use the retrieval to best answer the user query. I can't really imagine any scenario where a sequential/fixed chain would work better for that use case?

VladoPortos commented 1 year ago

Thanks @jpdus, I will give it a try tomorrow :)

jphme commented 1 year ago

Well, I stand somewhat corrected: in the meantime there is already another chain that probably does what you want without invoking an agent (even if I'd probably prefer the agent, because it's way easier to extend); see:

https://python.langchain.com/en/latest/modules/chains/index_examples/chat_vector_db.html#conversationalretrievalchain-with-question-answering-with-sources
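
(Roughly the pattern from the linked docs, sketched here with sources included; llm and retriever are reused from above:)

from langchain.chains import ConversationalRetrievalChain, LLMChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
from langchain.chains.qa_with_sources.loading import load_qa_with_sources_chain

# one LLMChain condenses the follow-up into a standalone question,
# one document chain answers it with sources
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_with_sources_chain(llm, chain_type="stuff")

conv_chain = ConversationalRetrievalChain(
    retriever=retriever,
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
)

chat_history = []
result = conv_chain({"question": "Some question", "chat_history": chat_history})
chat_history.append(("Some question", result["answer"]))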

VladoPortos commented 1 year ago

But is there a way to customize the prompt a little for this chain? It's not mentioned there. I would like to add some custom personality or just slightly adjust the prompt it is using.

jphme commented 1 year ago

Sure, you can customize the QA prompt the same way as written above, and the "condense_prompt" for the "question generator" (see the example in the documentation) can be changed as well (the default one can be found here: https://github.com/hwchase17/langchain/blob/85dae78548ed0c11db06e9154c7eb4236a1ee246/langchain/chains/conversational_retrieval/prompts.py#L4).

However, if you really want to add custom personality and leverage Chat System prompts, you should imho go with the agent for more flexibility and better customization (see https://github.com/hwchase17/langchain/blob/85dae78548ed0c11db06e9154c7eb4236a1ee246/langchain/agents/conversational_chat/base.py#L59).
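
(A sketch of swapping in a custom condense prompt via from_llm; the template text here is illustrative, but the input variables must stay {chat_history} and {question}:)

from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

# hypothetical custom condense prompt; extend the wording as you like
CUSTOM_CONDENSE_PROMPT = PromptTemplate.from_template(
    """Given the following conversation and a follow up question,
rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
)

conv_chain = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    condense_question_prompt=CUSTOM_CONDENSE_PROMPT,
)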

drahoslavzan commented 1 year ago

 qa_chain = load_qa_with_sources_chain(llm, chain_type="stuff",
                                      prompt=GERMAN_QA_PROMPT,
                                      document_prompt=GERMAN_DOC_PROMPT) 
 chain = RetrievalQAWithSourcesChain(combine_documents_chain=qa_chain, retriever=retriever,
                                     reduce_k_below_max_tokens=True, max_tokens_limit=3375,
                                     return_source_documents=True)
from langchain.agents import Tool
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent
from langchain.agents import AgentType

tools = [
    Tool(
        name = "Search",
        func=chain,
        description="useful to answer factual questions"
    ),
]
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
llm=ChatOpenAI(temperature=0)
agent_chain = initialize_agent(tools, llm, agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION, verbose=True, memory=memory)
agent_chain.run(input="Some Question")
...
agent_chain.run(input="Some follow-up Question")

If I try your approach, I end up with the error "NotImplementedError: Saving not supported for this chain type."

The problem is in langchain/chains/base.py, line 45, in _chain_type, which raises: none of the chains like StuffDocumentsChain or RetrievalQAWithSourcesChain override and implement that property.

If I create derived classes from those two above with the property defined, the agent behaves quite strangely.
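
(A minimal sketch of such a derived class; the type string returned is arbitrary:)

from langchain.chains import RetrievalQAWithSourcesChain

class SavableRetrievalQAWithSourcesChain(RetrievalQAWithSourcesChain):
    @property
    def _chain_type(self) -> str:
        # satisfies the NotImplementedError raised in langchain/chains/base.py
        return "retrieval_qa_with_sources"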

@jpdus does your approach still work with the current version, 0.0.157?

annjawn commented 1 year ago

Hey @jpdus, not to derail the conversation here (which is great), but is there a way to use a Structured Response Parser with load_qa_with_sources_chain?

hetthummar commented 1 year ago

(quoting @drahoslavzan's comment above in full)

I am also facing this issue. I tried it on the latest version, 0.0.162.

olaf-hoops commented 1 year ago

Tried to run the chain like this instead:

tools = [Tool(func=chain.run, description=db_desc, name='Internal DB')]

conversational_agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=4,
    early_stopping_method="generate",
    memory=memory,
)

But I'm receiving a ValueError: run not supported when there is not exactly one output key. Got ['answer', 'sources']. So I'm getting back an answer and sources, but the agent can't handle it.
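
(One workaround, sketched here with a hypothetical wrapper function, is to collapse the two output keys into a single string before handing the result to the agent; see also the updated tool definition below:)

from langchain.agents import Tool

def search_with_sources(query: str) -> str:
    # chain() returns a dict with 'answer' and 'sources'; the Tool
    # interface wants a single string, so merge the two here
    result = chain({"question": query})
    return f"{result['answer']}\nSources: {result['sources']}"

tools = [Tool(func=search_with_sources, description=db_desc, name='Internal DB')]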

jphme commented 1 year ago

@drahoslavzan @hetthummar @olaf-hoops you are right, the code doesn't work anymore in the current version (0.0.162 as of now).

There were multiple breaking changes, and I can't exactly figure out which one broke the approach (#3913 updated the related docs; @vowelparrot, maybe you can point out what the root issue was here?).

You have to change the tool definition to the following to get it working again:

tools = [
    Tool.from_function(
        name="Search",
        func=chain.__call__,  # passing the chain itself no longer works because of an inspection issue, and there is no call function
        coroutine=chain.acall,  # if you want to use async
        description="useful to answer factual questions"
    ),
]

jphme commented 1 year ago

Hey @jpdus, not to derail the conversation here (which is great), but is there a way to use a Structured Response Parser with load_qa_with_sources_chain?

There surely is some way, but I wouldn't recommend it, because there is some custom parsing logic in the RetrievalQAWithSourcesChain, see https://github.com/hwchase17/langchain/blob/2ceb807da24e3ad7f04ff79120842982f341cda8/langchain/chains/qa_with_sources/base.py#L131

If you want to parse/transform the answer, you should probably combine the RetrievalQAWithSourcesChain with a custom transformation chain (see https://python.langchain.com/en/latest/modules/chains/generic/transformation.html) and apply the output parser afterwards. But this depends on the exact use case...
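
(A rough sketch of that transformation-chain idea; parse_output is a made-up post-processing step, and chain is assumed to be the RetrievalQAWithSourcesChain from above, built without return_source_documents:)

from langchain.chains import SequentialChain, TransformChain

def parse_output(inputs: dict) -> dict:
    # hypothetical transform: apply an output parser or reshape the answer here
    return {"parsed_answer": inputs["answer"].strip()}

transform = TransformChain(
    input_variables=["answer"],
    output_variables=["parsed_answer"],
    transform=parse_output,
)

# run the QA chain, then the transform, as one pipeline
pipeline = SequentialChain(
    chains=[chain, transform],
    input_variables=["question"],
    output_variables=["parsed_answer"],
)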

tomatefarcie123 commented 1 year ago

Hello @jpdus, thanks for the examples above. Have you tried running this in async? For me it just hangs. I'm trying to run it with: results = await qa_chain.acall(inputs=data)

I also defined an async callback StreamingHandler to stream the results. Maybe that's blocking the execution somehow?

from uuid import UUID
from datetime import datetime
from typing import Any, Dict, List, Optional

import socketio
from langchain.callbacks.base import AsyncCallbackHandler
from langchain.schema import LLMResult

class StreamingHandler(AsyncCallbackHandler):
    def __init__(self, sio: socketio.AsyncServer, session_id, namespace: str, color: Optional[str] = None) -> None:
        """Initialize callback handler."""
        self.sio = sio
        self.session_id = session_id
        self.namespace = namespace
        self.color = color
        print("Initialized StreamingHandler with session_id:", self.session_id)

    # This will stream the tokens to the client
    async def on_llm_new_token(self, token: str, **kwargs) -> None:
        emit_time = datetime.now().strftime('%M:%S.%f')[:-3]
        print("Emitting new_token event at:", token, emit_time)
        await self.sio.emit('new_token', {"new_token": token}, namespace="/results", room=self.session_id)

    # Signal to the client that generation is starting
    async def on_llm_start(self, serialized: Dict[str, Any], prompts: List[str], *, run_id: UUID, parent_run_id: UUID | None = None, **kwargs: Any) -> Any:
        await self.sio.emit('first_token', namespace="/results", room=self.session_id)

    # Signal to the client that generation is ending
    # async def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
    #     await self.sio.emit('llm_end', namespace="/results", room=self.session_id)

    async def catch_all(*args, **kwargs):
        pass

    on_agent_action = on_agent_finish = on_chain_end = on_chain_error = on_chain_start = on_llm_end = on_llm_error = on_text = on_tool_end = on_tool_error = on_tool_start = catch_all

jphme commented 1 year ago

@tomatefarcie123 Yes, I tried my example above in async and it works. Please note that async execution and streaming are two related but different things, and that not all chains work with streaming tokens. I would advise first trying a non-streaming async version and/or a basic streaming LLM with your custom streaming handler.
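
(A minimal async streaming smoke test along those lines, assuming an OpenAI key is configured; swap in the custom handler once this works:)

import asyncio

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

async def main():
    # streams tokens to stdout as they arrive
    llm = ChatOpenAI(streaming=True, temperature=0,
                     callbacks=[StreamingStdOutCallbackHandler()])
    await llm.agenerate([[HumanMessage(content="Say hello")]])

asyncio.run(main())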

tomatefarcie123 commented 1 year ago

Thanks for your reply. Streaming worked in sync mode with the RetrievalQAWithSourcesChain, but as you suggest, I might turn it off for now until I get the rest of the async code to work. Cheers!

tevslin commented 1 year ago

The source data was being carried right into the last observation by the agent and then dropped when formulating the answer, even though return_source_documents is True in initialize_agent and the tool was built with VectorStoreQAWithSourcesTool. I kludged around the problem by setting return_intermediate_steps to True in initialize_agent and then mining the sources from the intermediate results with:

    try:
        data=json.loads(result["intermediate_steps"][-1][-1])
        print(data["sources"])
    except:
        print("sources not available",result)

labeebee commented 1 year ago

I was also curious about this. I want to generate two answers: one crisp and informative, and the other a slightly sweeter and more polite version of the first. How do you think I can accomplish this? I also want the chain to quote the sources at the end.

dsantiago commented 1 year ago

I think this how-to has the answer to adding a custom prompt to RetrievalQAWithSourcesChain: https://python.langchain.com/docs/use_cases/question_answering/integrations/openai_functions_retrieval_qa

justinlevi commented 1 year ago

@dsantiago How would I control the final prompt based on that example?


from langchain.chains import RetrievalQA
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains.openai_functions import create_qa_with_sources_chain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

qa_chain = create_qa_with_sources_chain(ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613"))

doc_prompt = PromptTemplate(
    template="Content: {page_content}\nSource: {source}",
    input_variables=["page_content", "source"],
)

final_qa_chain = StuffDocumentsChain(
    llm_chain=qa_chain,
    document_variable_name="context",
    document_prompt=doc_prompt,
)

retrieval_qa = RetrievalQA(
    retriever=vectorstore.as_retriever(), combine_documents_chain=final_qa_chain
)

print(retrieval_qa.run("What Ukrainian dishes do we have?"))

Answer

{
  "answer": "Some popular Ukrainian dishes include borscht with pork ribs, varenyky (dumplings), holubtsi (stuffed cabbage rolls), salo (cured pork fat), kovbasa (sausage), and deruny (potato pancakes).",
  "sources": ["/recipes/borscht-with-pork-ribs"]
}

The problem here is that the output is adding information that is not in the vector db. I should be able to control that with a prompt on the RetrievalQA, but it doesn't seem like I can do that.

I can pass a prompt in via:

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

Why can't I do that when defining the retrieval_qa via the constructor?

I'd like to add a prompt like this for the final chain:

QA_CHAIN_PROMPT = PromptTemplate.from_template("""Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know; don't try to make up an answer.
Answer with a single-sentence description of the dish. Never provide commentary on the context.

RULES:
DO NOT INCLUDE THE RECIPES IN YOUR ANSWER. Finish the answer with a question asking if the user would like to see the full recipe.

CONTEXT:
{context}

Question:
{question}
""")
Mennaaah commented 1 year ago

The source data was being carried right into the last observation by the agent and then dropped when formulating the answer, even though return_source_documents is True in initialize_agent and the tool was built with VectorStoreQAWithSourcesTool. I kludged around the problem by setting return_intermediate_steps to True in initialize_agent and then mining the sources from the intermediate results with:

    try:
        data=json.loads(result["intermediate_steps"][-1][-1])
        print(data["sources"])
    except:
        print("sources not available",result)

Hey tevslin, I am using the following to get the answers:

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
    chain_type_kwargs=chain_type_kwargs,
    return_source_documents=True,
    verbose=True,
)

How can I use it with initialize_agent to get the sources, please?

dosubot[bot] commented 11 months ago

Hi, @VladoPortos! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you were asking how to add a custom prompt to the RetrievalQAWithSourcesChain class. SimonB97 and jphme have provided solutions and code examples on how to achieve this. Additionally, there have been discussions about adding memory and using the Structured Response Parser. jphme suggests using a Chat Conversation Agent for more flexibility. Some users have also encountered issues with the code not working in the latest version.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project. If you have any further questions or need assistance, please don't hesitate to reach out.