run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Customizing chat engines #8771

Closed lucijaSub closed 1 year ago

lucijaSub commented 1 year ago

Question

I have an issue with writing custom prompts for a chat engine. I have a vector index over my custom knowledge. I am using CondenseQuestionChatEngine as my chat engine. I want to add a custom prompt that tells my chatbot "You are an expert on X topic, answer correctly and don't hallucinate", but I am confused about how to do it.

When I do it like this:

query_engine_base = self.vector_index.as_query_engine(text_qa_template=qa_template)

self.base_chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine_base,
    verbose=True,
    service_context=self.service_context)

The bot answers correctly, but it doesn't remember the chat history. How is this supposed to be done? I searched the documentation, but it has been confusing and I can't find a solution that does what I want.

dosubot[bot] commented 1 year ago

🤖

Hello @lucijaSub,

I'm an AI bot here to assist you while we wait for a human maintainer. I can help you solve bugs, answer questions, and guide you on becoming a contributor. I'm currently reviewing your issue and I'll provide you with a full answer shortly. Let's get this sorted out!

dosubot[bot] commented 1 year ago

🤖

Based on your question, it seems you want to add a custom prompt to the CondenseQuestionChatEngine and also retain the chat history. Here's how you can achieve this:

To add a custom prompt, you can modify the ChatMessage objects in the TEXT_QA_PROMPT_TMPL_MSGS and CHAT_REFINE_PROMPT_TMPL_MSGS lists. These lists are used to create the ChatPromptTemplate objects CHAT_TEXT_QA_PROMPT and CHAT_REFINE_PROMPT, which are then used in the as_query_engine method of the index object. Here is an example:

from llama_index.llms import ChatMessage, MessageRole
from llama_index.prompts import ChatPromptTemplate

TEXT_QA_PROMPT_TMPL_MSGS = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "You are an expert on the topic of llamas. "
            "Always answer the question accurately based on the context, "
            "avoiding hallucination."
        ),
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=(
            "Chat history is as follows:\n"
            "---------------------\n"
            "{chat_history}\n"
            "---------------------\n"
            "Context information is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the chat history and context information, "
            "answer the question: {query_str}\n"
        ),
    ),
]
CHAT_TEXT_QA_PROMPT = ChatPromptTemplate(message_templates=TEXT_QA_PROMPT_TMPL_MSGS)

In this example, {chat_history} is a placeholder for the chat history. You would need to replace this with the actual chat history when using the template.
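As a quick illustration of how those placeholders get filled, here is a plain-Python sketch using `str.format` with made-up history and context values (in practice the chat engine performs this substitution for you):

```python
# The user-message template from above; the values below are hypothetical.
user_msg_template = (
    "Chat history is as follows:\n"
    "---------------------\n"
    "{chat_history}\n"
    "---------------------\n"
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the chat history and context information, "
    "answer the question: {query_str}\n"
)

filled = user_msg_template.format(
    chat_history="user: Do llamas spit?\nassistant: Yes, when threatened.",
    context_str="Llamas are domesticated South American camelids.",
    query_str="What did I just ask about?",
)
print(filled)
```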

To retain the chat history, CondenseQuestionChatEngine uses the _memory attribute, which is an instance of ChatMemoryBuffer. When a new message is sent, the engine generates a standalone question from the conversation context and the last message. It then queries the query engine for a response. Both the user's message and the assistant's response are recorded in the chat history by calling the _memory.put method with a ChatMessage object. The chat history can be retrieved using the chat_history property, which calls the _memory.get_all method.
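The recording step described above can be sketched with a simplified stand-in for ChatMemoryBuffer (this is illustrative only, not the actual llama_index class):

```python
from dataclasses import dataclass, field

@dataclass
class ChatMessage:
    role: str
    content: str

@dataclass
class SimpleMemory:
    # Simplified stand-in for ChatMemoryBuffer: stores messages in order.
    _history: list = field(default_factory=list)

    def put(self, message: ChatMessage) -> None:
        self._history.append(message)

    def get_all(self) -> list:
        # Corresponds to the chat_history property on the engine.
        return list(self._history)

memory = SimpleMemory()
# The engine records both sides of each turn via put():
memory.put(ChatMessage(role="user", content="What do llamas eat?"))
memory.put(ChatMessage(role="assistant", content="Mostly grasses and hay."))

chat_history = memory.get_all()
print(len(chat_history))  # 2
```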

I hope this helps! If you have any other questions, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

ravi03071991 commented 1 year ago

@lucijaSub did you check the low-level composition section in the chat engine docs? https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/usage_pattern.html It should give you an idea of how to customize your prompt.
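For reference, the low-level composition pattern in that doc lets you pass a custom condense-question prompt (using the `{chat_history}` and `{question}` placeholders) via `condense_question_prompt` on `CondenseQuestionChatEngine.from_defaults`. Here is a plain-Python sketch of such a prompt string and how its placeholders get filled; the engine performs the substitution itself, and the history/question values below are hypothetical:

```python
# Sketch of a custom condense-question prompt; in llama_index you would wrap
# this string in a PromptTemplate and pass it as condense_question_prompt.
condense_prompt = (
    "Given the following conversation between a user and an assistant, "
    "rewrite the last message as a standalone question.\n"
    "Chat history:\n"
    "{chat_history}\n"
    "Follow-up message: {question}\n"
    "Standalone question: "
)

standalone_input = condense_prompt.format(
    chat_history="user: Tell me about llamas.\nassistant: Llamas are camelids.",
    question="Where do they live?",
)
print(standalone_input)
```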

logan-markewich commented 1 year ago

@lucijaSub you can add a system prompt

self.base_chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine_base,
    verbose=True,
    service_context=self.service_context,
    system_prompt="You are a ..."
)
ram510 commented 1 year ago

This did not answer the second part of the question: "The bot answers correctly, but it doesn't remember chat history. How is this supposed to be done? I searched in the documentation and it has been confusing, I can't find a solution that does what I want."

logan-markewich commented 1 year ago

@ram510 if you are using a chat engine (not a query engine) it should be keeping track.

You can confirm by checking chat_engine.memory.get() -- this will fetch the current chat window from the buffer

chat_engine.chat_history will fetch the full history, regardless of token limits
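The distinction between the two calls can be sketched with a toy buffer (a simplified stand-in, not the actual ChatMemoryBuffer, which trims by token count rather than message count):

```python
class ToyMemoryBuffer:
    """Toy stand-in: keeps full history, but get() returns only a window."""

    def __init__(self, window_size: int):
        # The real buffer limits by tokens; we use a message count for brevity.
        self.window_size = window_size
        self.messages = []

    def put(self, message: str) -> None:
        self.messages.append(message)

    def get(self) -> list:
        # What the LLM actually sees: only the most recent window.
        return self.messages[-self.window_size:]

    def get_all(self) -> list:
        # Full history, regardless of the window limit.
        return list(self.messages)

buf = ToyMemoryBuffer(window_size=2)
for msg in ["q1", "a1", "q2", "a2", "q3"]:
    buf.put(msg)

print(buf.get())      # ['a2', 'q3']
print(buf.get_all())  # ['q1', 'a1', 'q2', 'a2', 'q3']
```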

ram510 commented 1 year ago

Thanks @logan-markewich

lucijaSub commented 1 year ago

> @ram510 if you are using a chat engine (not a query engine) it should be keeping track.
>
> You can confirm by checking chat_engine.memory.get() -- this will fetch the current chat window from the buffer
>
> chat_engine.chat_history will fetch the full history, regardless of token limits

Thanks for the answer. I tried this and you are right, all the messages are kept in the chat history, but for some reason when I ask the bot what my last question was, it doesn't know how to answer. Maybe it uses the history only to formulate the standalone question and not as context for answering the query? Is this how it is supposed to work?

ambikeshsingh commented 6 months ago

Why is the chat engine giving responses outside the context, even though I have set a system prompt as well? My chat mode is "context". Please help.

YogeshMirpuri commented 2 months ago

> Why is the chat engine giving responses outside the context, even though I have set a system prompt as well? My chat mode is "context". Please help.

I am also facing the same problem. Did you find a solution?