langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

multiple prompt in a single RAG chain #16063

Closed mhyeonsoo closed 4 months ago

mhyeonsoo commented 7 months ago

Checked other resources

Example Code

# imports (these assume langchain>=0.0.350, i.e. the langchain-core / langchain-community split)
from langchain_community.llms import VertexAI
from langchain_community.embeddings import VertexAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

llm = VertexAI(model_name='gemini-pro', max_output_tokens=512, temperature=0.2)
data = loader.load()
embeddings = VertexAIEmbeddings(model_name='textembedding-gecko@003')
vectorstore = Chroma.from_documents(data, embeddings)
retriever = vectorstore.as_retriever()

template = """
Use the following pieces of context to answer the question at the end.
- If you think you need more information to find proper answer, ask user to clarify.
- If you cannot find a proper answer from the reference, just say we don't have a data from the reference.
Use three sentences maximum and keep the answer as concise as possible.

Context: {context}
Question: {question}
Helpful Answer:

**Resource:** [reference source name]
"""

template2 = """
If the user query lacks of below data, please ask user to get additional data.
- data1
- data2

If the user query has enough data, response with the context directly.
Context: {context}
Question: {question}
Answer: 
"""    

rag_prompt_custom = ChatPromptTemplate.from_template(template)
multi_turn_prompt_custom = ChatPromptTemplate.from_template(template2)  
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt_custom
    | llm
    | StrOutputParser()
)

response = rag_chain.invoke(message)

print(response)

Description

I would like to have a two-prompt chat flow in a single RAG chain. Below is what I am thinking about:

  1. Based on the query and the context information from the document, find the proper answer.
  2. With the generated (found) answer and the user's prompt, check whether the user's prompt lacks data. If it does, ask the user for the additional data. If there is enough data to find the answer in the document, respond with the found answer directly.

I have no idea how to merge these into one chain.

System Info

langchain==0.0.352

Related Components

dosubot[bot] commented 7 months ago

🤖

Hello @mhyeonsoo! I'm Dosu, an AI here to assist you with your inquiries and issues related to LangChain. While you're waiting for a human maintainer, feel free to ask me any questions or seek clarifications. Looking forward to helping you!

Based on your requirements, it seems you want to use two different prompts in a single RAG chain. You can achieve this by using the RunnableParallel class in LangChain. This class allows you to run multiple tasks concurrently, which is useful when you want to process the same input in different ways simultaneously.

Here's an example of how you can use RunnableParallel with two different prompts:

# Define your prompts
template1 = """
Use the following pieces of context to answer the question at the end.
- If you think you need more information to find proper answer, ask user to clarify.
- If you cannot find a proper answer from the reference, just say we don't have a data from the reference.
Use three sentences maximum and keep the answer as concise as possible.

Context: {context}
Question: {question}
Helpful Answer:

**Resource:** [reference source name]
"""

template2 = """
If the user query lacks of below data, please ask user to get additional data.
- data1
- data2

If the user query has enough data, response with the context directly.
Context: {context}
Question: {question}
Answer: 
"""    

prompt1 = ChatPromptTemplate.from_template(template1)
prompt2 = ChatPromptTemplate.from_template(template2)

# Define your RAG chain: run each prompt through the model in its own parallel branch
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | RunnableParallel(
        {
            "answer1": prompt1 | llm | StrOutputParser(),
            "answer2": prompt2 | llm | StrOutputParser(),
        }
    )
)

response = rag_chain.invoke(message)  # {"answer1": ..., "answer2": ...}
print(response)

In this code, template1 and template2 are your two different prompts, and prompt1 and prompt2 are created from them. Each prompt is piped into the model (and a string output parser) in its own branch of the RunnableParallel, so both prompts run concurrently on the same retrieved context and question, and the chain returns a dictionary containing both answers.

Please note that the RunnableParallel class is not directly shown in the provided context, but it's mentioned in the description of the Runnable class. According to the description, you can construct a RunnableParallel instance using a dictionary literal within a sequence or by passing a dictionary to RunnableParallel.
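For illustration, here is a minimal standalone sketch of those two construction forms; the runnables and names below are made-up placeholders, not part of the original example:

from langchain_core.runnables import RunnableLambda, RunnableParallel

# Two placeholder runnables used only to show the construction forms
upper = RunnableLambda(lambda s: s.upper())
lower = RunnableLambda(lambda s: s.lower())

# Form 1: construct RunnableParallel explicitly from a dictionary
branches = RunnableParallel({"upper": upper, "lower": lower})

# Form 2: a plain dict literal inside a sequence is coerced into a RunnableParallel
pipeline = RunnableLambda(lambda s: s.strip()) | {"upper": upper, "lower": lower}

print(branches.invoke("Hello"))      # {'upper': 'HELLO', 'lower': 'hello'}
print(pipeline.invoke("  Hello  "))  # {'upper': 'HELLO', 'lower': 'hello'}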

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a ๐Ÿ‘ if this is helpful and ๐Ÿ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

mhyeonsoo commented 7 months ago

Thanks @dosu-bot,

I mis-wrote part of the second prompt. I need to use the response from prompt1 inside prompt2, which looks like this:

template2 = """
If the user query lacks of below data, please ask user to get additional data.
- data1
- data2

If the user query has enough data, response with the context directly.
Context: {generated response from prompt1 processing}
Question: {question}
Answer: 
"""    

Can you help me with this?

dosubot[bot] commented 7 months ago

🤖

Based on your requirements, you can modify your RAG chain to include two prompts, where the second prompt uses the response generated from the first prompt's processing. Here's how you can do it:

# Define your second prompt:
second_template = """
If the user query lacks of below data, please ask user to get additional data.
- data1
- data2

If the user query has enough data, response with the context directly.
Context: {first_response}
Question: {question}
Answer: 
"""
second_prompt = ChatPromptTemplate.from_template(second_template)

# Define a second model (it can be the same as the first one)
second_llm = VertexAI(model_name='gemini-pro', max_output_tokens=512, temperature=0.2)

# Modify the RAG chain: keep the original inputs and add the first answer under "first_response"
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | RunnablePassthrough.assign(first_response=rag_prompt_custom | llm | StrOutputParser())
    | second_prompt
    | second_llm
    | StrOutputParser()
)

In this code, second_prompt is a new prompt that takes the output of the first model (the response to the first prompt) as input. The second_llm then processes this new prompt. The output of the second_llm is then parsed by StrOutputParser().

Please note that the second_template is a string that formats the second prompt. It uses {first_response} as a placeholder for the response from the first prompt. Its value is filled in when the chain runs, because the preceding step adds it to the input mapping under that key.

RunnablePassthrough.assign(first_response=...) is used to pass the output of the first prompt/model pair to the second prompt. It keeps the original input mapping (so {question} stays available) and adds the result of the assigned runnable to that mapping under the first_response key, which is exactly what second_prompt expects.
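For illustration, a minimal standalone sketch of how RunnablePassthrough.assign merges a new key into the incoming mapping; the names and values below are made up for the example:

from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# .assign keeps the original dict and adds the computed key alongside it
chain = RunnablePassthrough.assign(doubled=RunnableLambda(lambda d: d["n"] * 2))
print(chain.invoke({"n": 3}))  # {'n': 3, 'doubled': 6}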

I hope this helps! If you have any other questions, feel free to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a ๐Ÿ‘ if this is helpful and ๐Ÿ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

SudeepBatobyal commented 2 months ago

INTENT_TEMPLATE = """
You are a master of "User Intent Detection" for a chatbot. Your task is to find the intent of the user query based on user_query: {query_text}; you can also use the relevant context: {context_text} and the chat history: {history_buffer} if appropriate.

Go through all of the Intent_Categories:

  1. "Ticket": User wants to create a helpdesk, support, or security ticket.
  2. "Feedback": User wants to give feedback or a suggestion.
  3. "Out of Context": The context fetched {context_text} seems irrelevant with respect to the user query {query_text}.
  4. "Information Seeking": User intent seems to be seeking information.
  5. "Greeting & GoodByes": User is greeting or bidding farewell.
  6. "Gratitude": User is expressing gratitude.
  7. "Unclear": User query is very short, ambiguous, incomplete, misleading, or very general, and can lead to a false answer or hallucination.
  8. "UnIdentified": You are unable to categorize the query into any of the above intent categories.

Based on the eight intent categories ["Ticket", "Feedback", "Out of Context", "Information Seeking", "Greeting & GoodByes", "Gratitude", "Unclear", "UnIdentified"], choose only one and give it as output.

Input: {query_text}, {context_text}, {history_buffer}

Output: {intent_category}
"""

TEMPLATE = """
Based on the provided learning: {context_text}, chat history: {history_buffer}, and user intent: {intent_category}, formulate the answer for the user query: {query_text}.
"""

@cl.on_chat_start
async def on_chat_start():
    """This function executes only once, either when the app is opened or when a new chat session is started."""
    print("\nchat start\n")
    model = get_model()

    # memory for conversation buffer
    memory = ConversationBufferWindowMemory(k=N_MESSAGE_PASSES_IN_HISTORY)
    cl.user_session.set("memory", memory)

    # Session memory
    chat_memory = ConversationBufferWindowMemory(k=N_MESSAGE_PASSES_IN_SESSION)
    cl.user_session.set("chat_memory", chat_memory)

    prompt1 = ChatPromptTemplate.from_template(TEMPLATE)
    prompt2 = ChatPromptTemplate.from_template(INTENT_TEMPLATE)
    chain = (
        RunnableParallel(
            {
                "context_text": retrieve,
                "history_buffer": get_chat_history,
                "session_buffer": get_session_history,
                "query_text": RunnablePassthrough(),
                "intent_category": RunnablePassthrough(),
            }
        )
        | prompt1
        | model
        | RunnablePassthrough(key="intent_category")
        | prompt2
        | model
        | StrOutputParser()
    )
    cl.user_session.set("chain", chain)
TypeError: Expected mapping type as input to ChatPromptTemplate. Received <class 'langchain_core.messages.ai.AIMessageChunk'>.
Can someone please let me know where I am committing the mistake?
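For reference, the TypeError above is raised because the output of the first model step is a message (an AIMessageChunk), not a mapping, so prompt2 has no dictionary from which to fill {query_text}, {context_text}, {history_buffer}, and {intent_category}; RunnablePassthrough(key="intent_category") simply passes that message through unchanged. Below is a minimal sketch of one possible restructuring, not the definitive fix. It assumes the retrieve, get_chat_history, and model helpers from the snippet above, omits session_buffer for brevity, and assumes the "Output: {intent_category}" placeholder is dropped from INTENT_TEMPLATE, since that value is what the first model call produces rather than an input:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough

# Gather every template variable into one mapping
inputs = RunnableParallel(
    {
        "context_text": retrieve,
        "history_buffer": get_chat_history,
        "query_text": RunnablePassthrough(),
    }
)

# First call: classify the intent; its string output will be stored under "intent_category"
intent_chain = ChatPromptTemplate.from_template(INTENT_TEMPLATE) | model | StrOutputParser()

# Second call: TEMPLATE can now read {intent_category} alongside the original keys
answer_prompt = ChatPromptTemplate.from_template(TEMPLATE)

chain = (
    inputs
    | RunnablePassthrough.assign(intent_category=intent_chain)
    | answer_prompt
    | model
    | StrOutputParser()
)

With this shape, every prompt in the chain receives a mapping, which is what ChatPromptTemplate expects.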