langchain-ai / langchain

πŸ¦œπŸ”— Build context-aware reasoning applications
https://python.langchain.com
MIT License

Issue: Making sure `ConversationalRetrievalChain` only answer based on the retriever information #13414

Closed Β· eduardoscamargo closed this issue 6 months ago

eduardoscamargo commented 10 months ago

Issue you'd like to raise.

Hi,

I have been learning LangChain for the last month and I have been struggling for the last week to "guarantee" that ConversationalRetrievalChain only answers based on the knowledge added as embeddings. I don't know if I am missing some LangChain configuration or if it is just a matter of tuning my prompt. I will add my code here (simplified, not the actual one, but I will try to preserve everything important).

# Imports added for completeness (inferred from the class names used below)
import os
from langchain.chat_models import AzureChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

chat = AzureChatOpenAI(
        deployment_name="chat",
        model_name="gpt-3.5-turbo",
        openai_api_version=os.getenv('OPENAI_API_VERSION'),
        openai_api_key=os.getenv('OPENAI_API_KEY'),
        openai_api_base=os.getenv('OPENAI_API_BASE'),
        openai_api_type="azure",
        temperature=0
      )

embeddings = OpenAIEmbeddings(deployment_id="text-embedding-ada-002", chunk_size=1)

acs = AzureSearch(azure_search_endpoint=os.getenv('AZURE_COGNITIVE_SEARCH_SERVICE_NAME'),
              azure_search_key=os.getenv('AZURE_COGNITIVE_SEARCH_API_KEY'),
              index_name=os.getenv('AZURE_COGNITIVE_SEARCH_INDEX_NAME'),
              embedding_function=embeddings.embed_query)

custom_template = """You work for CompanyX which sells things located in United States.
        If you don't know the answer, just say that you don't. Don't try to make up an answer.
        Base your questions only on the knowledge provided here. Do not use any outside knowledge.
        Given the following chat history and a follow up question, rephrase the follow up question to be a standalone question, in its original language.
        Chat History:
        {chat_history}
        Follow Up Input: {question}
        Standalone question:
      """

CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)

memory = ConversationBufferMemory(memory_key="chat_history", input_key="question", return_messages=True)

qa = ConversationalRetrievalChain.from_llm(
  llm=chat,
  retriever=acs.as_retriever(),
  condense_question_prompt=CUSTOM_QUESTION_PROMPT,
  memory=memory
)

When I ask it something like qa({"question": "What is an elephant?"}), it still answers, even though the question is totally unrelated to the knowledge base added to AzureSearch via the embeddings.

I tried different condense_question_prompt values, with different results, but nothing near good. I've been reading the documentation and the API reference for the last 3 weeks, but nothing else seems to help in this case. I'd appreciate any suggestions.

Suggestion:

No response

dosubot[bot] commented 10 months ago

πŸ€–

Hello @eduardoscamargo,

I'm Dosu, a bot designed to help you solve bugs, answer questions, and contribute to LangChain. I'm here to assist you while we wait for a human maintainer to join us. I see you've posted an issue and I'm currently reviewing it. I'll provide a full answer shortly. Hang tight!

dosubot[bot] commented 10 months ago

πŸ€–

Based on the code you've shared, it seems like you're using the ConversationalRetrievalChain correctly. However, the issue you're facing might be due to the temperature parameter in your AzureChatOpenAI instance. The temperature parameter controls the randomness of the language model's output. A value of 0 makes the output deterministic, while a higher value makes the output more diverse. If you set temperature to 0, the model will always return the same output for a given input, which might not be what you want in a conversational setting.

Regarding your issue with the model answering questions that are not related to the knowledge base, it's important to note that the ConversationalRetrievalChain uses a two-step process to generate responses:

  1. It uses the retriever to fetch relevant documents based on the question.
  2. It uses the llm (language model) to generate a response based on the retrieved documents and the question.

If the retriever doesn't find any relevant documents, the llm will still try to generate a response based on the question alone. This is likely why you're seeing responses to questions like "What is an elephant?" even though this information is not in your knowledge base.
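
Note also that the custom_template above only changes the question-rewriting (condense) step; the final answer is produced by a separate question-answering prompt over the retrieved documents. A minimal sketch of supplying that prompt through combine_docs_chain_kwargs (assuming the default "stuff" chain type; qa_template and QA_PROMPT are illustrative names and the prompt wording is only an example):

qa_template = """Answer the question using only the context below.
If the answer is not in the context, just say that you don't know. Don't try to make up an answer.

Context:
{context}

Question: {question}
Answer:"""

QA_PROMPT = PromptTemplate.from_template(qa_template)

qa = ConversationalRetrievalChain.from_llm(
  llm=chat,
  retriever=acs.as_retriever(),
  condense_question_prompt=CUSTOM_QUESTION_PROMPT,
  combine_docs_chain_kwargs={"prompt": QA_PROMPT},  # prompt used when generating the final answer
  memory=memory
)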

To prevent the model from generating responses when no relevant documents are found, you can modify the ConversationalRetrievalChain to return a fixed response in such cases. Here's how you can do it:

qa = ConversationalRetrievalChain.from_llm(
  llm=chat,
  retriever=acs.as_retriever(),
  condense_question_prompt=CUSTOM_QUESTION_PROMPT,
  memory=memory,
  response_if_no_docs_found="I don't know"  # Set a fixed response
)

With this modification, the ConversationalRetrievalChain will return "I don't know" if the retriever doesn't find any relevant documents.
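
Note that response_if_no_docs_found only takes effect when the retriever returns an empty list of documents, and most vector-store retrievers return the top-k nearest neighbors regardless of how weak the match is. A rough sketch of pairing it with a score-threshold retriever so that off-topic queries can come back empty (this assumes the underlying vector store supports search_type="similarity_score_threshold"; the 0.8 threshold is purely illustrative):

retriever = acs.as_retriever(
  search_type="similarity_score_threshold",  # requires relevance-score support in the vector store
  search_kwargs={"score_threshold": 0.8}     # illustrative value; tune for your index
)

qa = ConversationalRetrievalChain.from_llm(
  llm=chat,
  retriever=retriever,
  condense_question_prompt=CUSTOM_QUESTION_PROMPT,
  memory=memory,
  response_if_no_docs_found="I don't know"
)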

I hope this helps! Let me know if you have any other questions.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a πŸ‘ if this is helpful and πŸ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

eduardoscamargo commented 10 months ago

@dosu-bot, adding the parameter response_if_no_docs_found didn't change the behavior; it kept answering things that were unrelated to the knowledge base. I also tried adjusting the temperature, and that didn't solve the problem either.

dosubot[bot] commented 10 months ago

πŸ€–

Based on the issues and discussions in the LangChain repository, it seems that you can configure LangChain to return answers only from the ingested database, rather than using its pre-trained information. This can be achieved by using the QUESTION_PROMPT and COMBINE_PROMPT templates defined in the map_reduce_prompt.py file.

Here is the relevant code:

# question_prompt_template is defined alongside these prompts in map_reduce_prompt.py;
# it asks the model to return only the text from the given {context} that is relevant to the {question}.
QUESTION_PROMPT = PromptTemplate(
    template=question_prompt_template, input_variables=["context", "question"]
)

combine_prompt_template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""
COMBINE_PROMPT = PromptTemplate(
    template=combine_prompt_template, input_variables=["summaries", "question"]
)

In this way, LangChain ensures that the language model only returns answers that are present in the ingested database, rather than using its pre-trained information. You can find more details in the map_reduce_prompt.py file.
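
To wire these prompts into the conversational chain, one possibility (a rough sketch, assuming the map_reduce chain type; ConversationalRetrievalChain.from_llm forwards combine_docs_chain_kwargs to the underlying question-answering chain) is:

qa = ConversationalRetrievalChain.from_llm(
  llm=chat,
  retriever=acs.as_retriever(),
  chain_type="map_reduce",
  combine_docs_chain_kwargs={
    "question_prompt": QUESTION_PROMPT,
    "combine_prompt": COMBINE_PROMPT
  },
  memory=memory
)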

As for the process of ingesting a TXT database into a vector store in LangChain, here is a sample code snippet:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.epsilla import Epsilla
from pyepsilla import vectordb

# Initialize the Epsilla client
client = vectordb.Client()
vectorstore = Epsilla(
    client=client,
    embeddings=OpenAIEmbeddings(),
    db_path="/tmp/vectorstore",
    db_name="langchain_store"
)

# Read the texts from the TXT database
with open('database.txt', 'r') as file:
    texts = file.readlines()

# Add the texts to the vector store
vectorstore.add_texts(texts)

In this example, the OpenAIEmbeddings class is used for embeddings. The texts are read from a TXT file named 'database.txt'. The add_texts method is then used to add these texts to the vector store.
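
If the TXT file contains long passages rather than one entry per line, it is usually better to split the text into overlapping chunks before embedding instead of using readlines(). A small sketch using langchain's CharacterTextSplitter (the chunk_size and chunk_overlap values are illustrative):

from langchain.text_splitter import CharacterTextSplitter

# Read the whole file and split it into overlapping chunks before embedding
with open('database.txt', 'r') as file:
    raw_text = file.read()

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
texts = splitter.split_text(raw_text)

vectorstore.add_texts(texts)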

I hope this helps! Let me know if you have any other questions.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a πŸ‘ if this is helpful and πŸ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

eduardoscamargo commented 10 months ago

The last suggestion is not related to the original question anymore.

dosubot[bot] commented 10 months ago

@baskaryan, I could use your expertise on this matter.

tsantra commented 9 months ago

I am facing a similar issue, but I am using memory in my ConversationalRetrievalChain.from_llm. Look at my comment here: [image]

coder-pikachu commented 4 months ago

Incorrect closure. This issue is still open.

Sinister-00 commented 3 months ago

@eduardoscamargo Did you find any solution for this issue?