langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

embedding router with conversation history #15854

Closed. mhyeonsoo closed this issue 5 months ago.

mhyeonsoo commented 5 months ago

Example Code

I wrote the following code to categorize prompts:

# Imports assumed for this snippet (embeddings, memory, and llm are defined elsewhere):
from langchain.prompts import PromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough
from langchain.utils.math import cosine_similarity

def prompt_router(input, embeddings, prompt_templates, prompt_embeddings):
    # Embed the query and pick the case template whose embedding is most similar.
    query_embedding = embeddings.embed_query(input["query"])
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    print(most_similar)
    return PromptTemplate.from_template(most_similar)

def main_categorizer(message):
    global memory, entry
    formatted_history = memory.get_history_as_string()

    case1_template = """Description of what case1 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case2_template = """Description of what case2 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case3_template = """Description of what case3 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case4_template = """Description of what case4 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    prompt_templates = [case1_template, case2_template, case3_template, case4_template]
    prompt_embeddings = embeddings.embed_documents(prompt_templates)

    # NOTE: as written, RunnableLambda calls prompt_router with a single argument;
    # embeddings, prompt_templates, and prompt_embeddings would need to be bound
    # first (e.g. with functools.partial) for this chain to run.
    chain = (
        {"query": RunnablePassthrough()}
        | RunnableLambda(prompt_router)
        | llm
        | StrOutputParser()
    )

Description

Based on the embedding router cookbook at https://python.langchain.com/docs/expression_language/cookbook/embedding_router, I've tried to implement an embedding router. What I would like to do is add the conversation history to the case prompts, so that the router can also take past conversation into account when deciding which category the user prompt belongs to.

Here, I have no idea where to supply the {chat_history} value in the same way the query is inserted with "query": RunnablePassthrough().

System Info

langchain==0.0.352

dosubot[bot] commented 5 months ago

🤖

Hello @mhyeonsoo! I'm Dosu, a friendly bot here to lend a hand with your LangChain-related queries, bugs, and anything else you might need help with. While you're waiting for a human maintainer, feel free to ask me anything. I'm here to assist you. Let's get this conversation rolling!

Based on the information you've provided and the context from the LangChain repository, it seems like you're trying to incorporate the conversation history into the case prompts in your embedding router implementation. Here's how you can do it:

You can modify the PROMPT template to include a new variable for the conversation history.

prompt_template = """Use the following pieces of context and conversation history to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: {context}

Conversation History: {history}

Question: {question}
Helpful Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "history", "question"]
)

In this updated template, {history} is a placeholder for the conversation history. You would need to pass the conversation history as an argument wherever you're using this PROMPT. You can find more details about this in the LangChain repository.
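
For example, filling this template by hand might look like this (a sketch only; the values are illustrative placeholders):

filled = PROMPT.format(
    context="(retrieved context here)",
    history="Human: hello\nAI: hi there",
    question="What did I just say?",
)
print(filled)  # the full prompt string with all three slots filled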

Regarding your question about RunnablePassthrough(), it is a class in the LangChain framework that passes through inputs unchanged or with additional keys. It behaves almost like the identity function, but it can be configured to add additional keys to the output if the input is a dictionary. You can find more details about this in the LangChain repository.
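
As a minimal sketch of that behavior:

from langchain.schema.runnable import RunnablePassthrough

# Passes the input through unchanged.
RunnablePassthrough().invoke({"query": "hi"})  # -> {"query": "hi"}

# .assign() keeps the input and adds computed keys.
with_history = RunnablePassthrough.assign(history=lambda x: "no history yet")
with_history.invoke({"query": "hi"})  # -> {"query": "hi", "history": "no history yet"}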

I hope this helps! If you have any other questions, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

mhyeonsoo commented 5 months ago

Thanks @dosu-bot,

Can you apply what you suggested to my current code base, so I can understand it in context more easily?

Let me share roughly my full code snippet below.

    # (Assumes the imports from the first snippet plus: from langchain.llms import VertexAI.
    # embeddings and memory are defined elsewhere.)
    case1_template = """Description of what case1 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case2_template = """Description of what case2 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case3_template = """Description of what case3 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    case4_template = """Description of what case4 does

    Chat History:
    {chat_history}

    Here is a question:
    {query}"""

    llm = VertexAI(model_name='text-bison@001', max_output_tokens=512, temperature=0.2)
    prompt_templates = [case1_template, case2_template, case3_template, case4_template]
    prompt_embeddings = embeddings.embed_documents(prompt_templates)

    def prompt_router(input):
        query_embedding = embeddings.embed_query(input["query"])
        similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
        most_similar = prompt_templates[similarity.argmax()]
        print("Using case1" if most_similar == case1_template else "Using case2")
        return PromptTemplate.from_template(most_similar)

    chain = (
        {"query": RunnablePassthrough()}
        | RunnableLambda(prompt_router)
        | llm
        | StrOutputParser()
    )

dosubot[bot] commented 5 months ago

🤖

To incorporate the conversation history into your case prompts, you can modify your prompt_router function to accept both the query and the chat history as inputs. Then, you can format your case templates with the chat history and the query.

Here's how you can modify your prompt_router function:

def prompt_router(input):
    chat_history = input["chat_history"]  # supplied by the chain below
    query = input["query"]
    query_embedding = embeddings.embed_query(query)
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    print("Using case1" if most_similar == case1_template else "Using case2")
    # Format the most similar template with the chat history and the query
    return PromptTemplate.from_template(most_similar.format(chat_history=chat_history, query=query))

And here's how you can modify your chain:

chain = (
    {"query": RunnablePassthrough(), "chat_history": RunnableLambda(lambda _: memory.get_history_as_string())}
    | RunnableLambda(prompt_router)
    | llm
    | StrOutputParser()
)

In this modified chain, a RunnableLambda is used to get the chat history from the memory and pass it to the prompt_router function along with the query. The prompt_router function then formats the most similar case template with the chat history and the query before passing it to the language model.
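
For example, an end-to-end invocation might look like this (illustrative; it assumes memory, embeddings, llm, and the case templates above are all defined, and the question is made up):

answer = chain.invoke("How do I reset my password?")
print(answer)  # the LLM's response, routed through the best-matching case template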

Please note that this is a basic example and you might need to adjust it according to your specific needs. For example, you might want to handle the case where the chat history is empty or too long to be included in the prompt.
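
For example, to guard against an empty or overly long history, you could clip it before it reaches the router (a sketch only; MAX_HISTORY_CHARS and clipped_history are illustrative names, not LangChain APIs):

MAX_HISTORY_CHARS = 2000  # assumed budget; tune for your model's context window

def clipped_history(_):
    history = memory.get_history_as_string()
    if not history:
        return "(no prior conversation)"
    return history[-MAX_HISTORY_CHARS:]  # keep only the most recent portion

# Drop-in replacement for the lambda in the chain above:
#   "chat_history": RunnableLambda(clipped_history)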

I hope this helps! If you have any further questions, feel free to ask.

For more information on how to use RunnablePassthrough and RunnableLambda, you can refer to the LangChain documentation and the source code.

mhyeonsoo commented 5 months ago

Thanks @dosu-bot, that looks great to me.

One more thing I need: I would like this function to return only the most similar category as a single string, i.e. the value of the most_similar variable in the prompt_router function.

If so, should I build the chain like this,

chain = (
    {"query": RunnablePassthrough(), "chat_history": RunnableLambda(lambda _: memory.get_history_as_string())}
    | RunnableLambda(prompt_router)
)

which makes prompt_router the final step of the chain, and then change the prompt_router function to return most_similar rather than PromptTemplate.from_template(most_similar.format(chat_history=chat_history, query=query))?

dosubot[bot] commented 5 months ago

🤖

Sure, you can modify the prompt_router function to return the most_similar category as a single string. Here's how you can do it:

def prompt_router(input):
    query = input["query"]
    chat_history = input["chat_history"]
    query_embedding = embeddings.embed_query(query)
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    most_similar = prompt_templates[similarity.argmax()]
    return most_similar.format(chat_history=chat_history, query=query)

And then, you can adjust the chain to process prompt_router as the final step:

chain = (
    {"query": RunnablePassthrough(), "chat_history": RunnableLambda(lambda _: memory.get_history_as_string())}
    | RunnableLambda(prompt_router)
)

This way, the prompt_router function is the final step in the chain, and the chain returns the selected case template, formatted with the chat history and query, as a single string.

Please note that RunnableLambda wraps the prompt_router function so that it can be used in the chain. RunnablePassthrough forwards the query unchanged, while the wrapped lambda supplies the chat_history value.
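
For example (illustrative; assumes memory, embeddings, prompt_templates, and prompt_embeddings from the earlier snippets):

routed = chain.invoke("How do I reset my password?")
print(routed)  # the selected case template, filled in, as one plain string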

I hope this helps! If you have any other questions, feel free to ask.

mhyeonsoo commented 5 months ago

Thanks!