Open SiddharthMurjani opened 4 months ago
To restrict your chatbot strictly to the ingested vector space and prevent it from going to the internet to provide answers, you can customize the text_qa_template and refine_template to provide specific instructions. This ensures that the chatbot only responds based on the context of the ingested documents.
Here is an example of how you might customize the templates:
text_qa_template = """
You are a helpful assistant. Answer the question based on the provided context. If the answer is not in the context, respond with "I don't have that information."
Context: {context}
Question: {question}
Answer:
"""
refine_template = """
You are a helpful assistant. Refine the following answer based on the provided context. If the answer is not in the context, respond with "I don't have that information."
Existing Answer: {existing_answer}
Context: {context}
Refined Answer:
"""
# Use these templates in your LLMChain or other relevant components,
# wrapping them as PromptTemplate objects (assumes: from langchain.prompts import PromptTemplate)
question_generator = LLMChain(
    llm=question_gen_llm,
    prompt=PromptTemplate.from_template(text_qa_template),
    callback_manager=manager,
)
doc_chain = load_qa_chain(
    streaming_llm,
    chain_type="stuff",
    prompt=PromptTemplate.from_template(refine_template),
    callback_manager=manager,
)
This setup ensures that the chatbot's responses are strictly based on the ingested vector space and do not rely on external internet sources. Additionally, you can further refine the chatbot's behavior by integrating it with a vector store and using a retrieval-based approach, as shown in the example setup for integrating LlamaIndex with LangChain:
def get_chain(vectorstore: VectorStore, question_handler, stream_handler) -> ConversationalRetrievalChain:
    manager = AsyncCallbackManager([])
    question_manager = AsyncCallbackManager([question_handler])
    stream_manager = AsyncCallbackManager([stream_handler])

    question_gen_llm = OpenAI(
        temperature=0,
        verbose=True,
        callback_manager=question_manager,
    )
    streaming_llm = OpenAI(
        streaming=True,
        callback_manager=stream_manager,
        verbose=True,
        temperature=0,
    )

    question_generator = LLMChain(
        llm=question_gen_llm,
        prompt=PromptTemplate.from_template(text_qa_template),
        callback_manager=manager,
    )
    doc_chain = load_qa_chain(
        streaming_llm,
        chain_type="stuff",
        prompt=PromptTemplate.from_template(refine_template),
        callback_manager=manager,
    )

    qa = ConversationalRetrievalChain(
        retriever=vectorstore.as_retriever(),
        combine_docs_chain=doc_chain,
        question_generator=question_generator,
        callback_manager=manager,
    )
    return qa
By following these steps, you can ensure that your chatbot's responses are confined to the ingested vector space and do not include information from external sources [1][2][3].
I believe llama-index does not make use of LLMChain and ConversationalRetrievalChain.
It's not accessing the internet? It's either using internal knowledge or hallucinating.
You can try adding an additional system prompt with more instructions for how the LLM should act: as_chat_engine(..., system_prompt="...")
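For example, a minimal sketch of that suggestion (the prompt wording is just illustrative, and index is assumed to be your already-built VectorStoreIndex):

# Assumes `index` is an existing VectorStoreIndex over the ingested documents.
chat_engine = index.as_chat_engine(
    chat_mode="context",
    system_prompt=(
        "Answer only from the retrieved context. "
        "If the context does not contain the answer, say 'I am not sure.'"
    ),
)
response = chat_engine.chat("Where is India?")  # a query outside the ingested documents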
Tried this, doesn't seem to work
DEFAULT_TEXT_QA_PROMPT_TMPL = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "You are a code chatbot\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "If the query is generic, do not provide an answer \n"
    "with your knowledge, just say, I cannot provide answer.\n"
    "If no context is retrieved, do not synthesize any answer "
    "with previous history or context.\n"
    "Straight up say, 'I am not sure.'\n"
    "Query: {query_str}\n"
    "Answer: "
)
system_prompt = """
You are a helpful code chat assistant.
Your responses should be based solely on the retrieved context provided to you.
If no relevant context is retrieved or if the retrieved context does not contain the necessary information to answer the question, respond with "I'm not sure" or "I don't have enough information to answer that question."
Do not synthesize or generate answers based on general knowledge or information from the internet.
Stick strictly to the information provided in the retrieved context.
"""
chat_engine = vector_index.as_chat_engine(
    chat_mode=ChatMode.CONTEXT,
    memory=memory,
    chat_store=chat_store,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.81)],
    text_qa_template=PromptTemplate(DEFAULT_TEXT_QA_PROMPT_TMPL),
    system_prompt=system_prompt,
    verbose=True,
    response_mode="no_text",
)
"""
Can you please help?
One example:
Question: "Where is India?"
Bot answer: "India is a country in South Asia. It is the seventh-largest country by area; the most populous country as of June 2023; and from the time of its independence in 1947, the world's most populous democracy."
Seems like you just need to iterate with the prompt some more 🤷🏻 Different models are better or worse at following prompts.
Also, the text_qa_template is not used in a context chat engine. It has a context_template, or optionally, if the system_prompt is provided, it's appended to the context_template.
Here's the default context template:
DEFAULT_CONTEXT_TEMPLATE = (
    "Context information is below."
    "\n--------------------\n"
    "{context_str}"
    "\n--------------------\n"
)
And your system prompt would get appended to the end of that
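So if you want to tighten the instructions for a context chat engine, pass a custom context_template (and/or a system_prompt) rather than text_qa_template. A minimal sketch (the prompt wording and the index variable are just illustrative):

custom_context_template = (
    "Context information is below."
    "\n--------------------\n"
    "{context_str}"
    "\n--------------------\n"
    "Only use the context above to answer. "
    "If it does not contain the answer, say 'I am not sure.'\n"
)

# `index` stands for your existing VectorStoreIndex; chat_mode="context" uses the context chat engine
chat_engine = index.as_chat_engine(
    chat_mode="context",
    context_template=custom_context_template,
    system_prompt="You are a code chatbot. Answer strictly from the retrieved context.",
)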
@logan-markewich We are also seeing the issue described above when using the chat engine.
When we ask an initial question, we get a correct answer that includes detail pulled from our index context. If we then ask a follow-up question, we get a response that is not bound to the context of the chat engine. If we ask a subsequent question that is unrelated to the previous one, we again get a correct response.
A simple setup where we see this issue:
# list of `ChatMessage` objects
custom_chat_history = [
    ChatMessage(
        role=MessageRole.USER,
        content="How many orders are over 50?",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT,
        content="You have 1,842,541 orders over £50",
    ),
]
# setup the chat instance
chat_engine = index.as_chat_engine(
    chat_mode="context",
    system_prompt=db_query_template,
    llm=llm,
)

# ask follow up question
answer = chat_engine.chat("and also under 2000?", chat_history=custom_chat_history)
This produces an answer that has the correct structure but references a property that doesn't exist anywhere in the chat history or the index context.
I've tried tweaking the context_template, but it doesn't seem to influence the behavior of follow-up questions. Is this an issue with how the system_prompt is handled? We have a {context_str} in the system prompt; is that going to be populated given the way the context_prompt works?
Or is this linked to us manually managing the chat history?
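For reference, based on the behaviour described above, only the context_template gets formatted with the retrieved text, and the system_prompt is then concatenated as-is, so a {context_str} placeholder left inside the system_prompt would not be substituted. A rough illustration of that composition (simplified sketch, not the actual llama-index source, and the exact ordering may differ by version):

# Rough sketch of how the prefix/system message appears to be assembled:
def build_prefix_message(context_template: str, system_prompt: str, retrieved_text: str) -> str:
    # Only the context_template is formatted with the retrieved node text...
    context_block = context_template.format(context_str=retrieved_text)
    # ...and the system_prompt is appended verbatim, so a literal "{context_str}"
    # placeholder inside system_prompt is never substituted.
    return context_block + "\n" + system_prompt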
Question
Hi @logan-markewich,
I want to restrict my chatbot strictly to the ingested vector space. How can I achieve this?
I have tried using response_synthesizer and chat_mode.
Can you please assist me here?