ConversationalRetrievalChain with ChatGoogleGenerativeAI performs poorly #16081

Closed andrei-rusu closed 5 months ago

andrei-rusu commented 8 months ago

Example Code

I have omitted irrelevant portions as much as possible:


import os

import qdrant_client
import streamlit as st
from pypdf import PdfReader
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import Qdrant
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

def get_text(docs):
    # Concatenate the text of every page of every uploaded PDF
    if not docs:
        return None
    all_text = ""
    for doc in docs:
        reader = PdfReader(doc)
        for page in reader.pages:
            all_text += page.extract_text() or ""
    return all_text

def get_chunks(text):
    splitter = CharacterTextSplitter(
        separator='\n',
        chunk_size=1000,
        chunk_overlap=200,
    )
    chunks = splitter.split_text(text)
    return chunks

def get_vectorstore(chunks, model='models/embedding-001'):
    client = qdrant_client.QdrantClient(
        os.environ['QDRANT_HOST'],
        api_key=os.environ["QDRANT_API_KEY"],
    )
    # models/embedding-001 produces 768-dimensional vectors
    vectors_config = qdrant_client.http.models.VectorParams(
        size=768,
        distance=qdrant_client.http.models.Distance.COSINE
    )
    # Note: this drops and recreates the collection on every run
    client.recreate_collection(
        collection_name=os.environ["QDRANT_COLLECTION"],
        vectors_config=vectors_config
    )
    embeddings = GoogleGenerativeAIEmbeddings(model=model)
    vectorstore = Qdrant(
        client=client, collection_name=os.environ["QDRANT_COLLECTION"],
        embeddings=embeddings,
    )
    vectorstore.add_texts(chunks)
    return vectorstore

def get_conversation_chain(vectorstore):
    llm = ChatGoogleGenerativeAI(model="gemini-pro", convert_system_message_to_human=True, temperature=0)
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        memory=memory,
    )
    return chain

docs = st.file_uploader("Upload your documents:", accept_multiple_files=True, type=["pdf", "txt", "csv"])
user_input = "Some question"
if st.button("Process"):
    with st.spinner("Processing..."):
        text = get_text(docs)
        chunks = get_chunks(text)
        vectorstore = get_vectorstore(chunks)
        chain = get_conversation_chain(vectorstore)
        response = chain({'question': user_input})

Description

I'm trying to use ConversationalRetrievalChain with the ChatGoogleGenerativeAI integration. It performs rather poorly in several scenarios involving PDF documents, with two issues arising frequently:

  1. The model quite often defaults to "I don't know" or "I cannot find the information in the context provided", even though the retrieved context appears to contain the requested information (so it is unlikely to be a problem with the vector DB).
  2. Some of the LangChain-built system prompts (converted to human messages via the recently added `convert_system_message_to_human` parameter) ask the model to rephrase something in its "original language", and this completely confuses it, making it output its response in a random language. I have tried using the prompt created by LangChain with the Google API directly, and the response is also bad. Example prompt created by LangChain from my conversation: "Human: Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\n\nChat History:\n\nHuman: what does the third amendment say?\nAssistant: I do not have access to the context necessary to answer that question.\nFollow Up Input: what does the second amendment say?\nStandalone question:". This outputs a message in a random language on both the Google Gemini API and LangChain. A possible workaround is sketched below.

Example document: the Bill of Rights. Example questions: see the example prompt above.
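
One workaround I would expect to help (a sketch only; I have not fully validated it against this setup) is to pass a custom condense-question prompt that avoids the word "original"; `condense_question_prompt` is an argument accepted by `ConversationalRetrievalChain.from_llm`:

from langchain.prompts import PromptTemplate

# Custom condense-question prompt that says "input language" instead of
# the default prompt's "original language"
CONDENSE_PROMPT = PromptTemplate.from_template(
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question, in its input language.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory,
    condense_question_prompt=CONDENSE_PROMPT,
)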

System Info

langchain==0.0.351
langchain-community==0.0.4
langchain-core==0.1.1
langchain-experimental==0.0.47
langchain-google-genai==0.0.5

dosubot[bot] commented 8 months ago

πŸ€–

Hello @andrei-rusu! πŸ––

I'm Dosu, a bot designed to lend a hand with bugs, answer your questions, and guide you on becoming a contributor. Even though I'm not a human, I'll do my best to assist you while we wait for a human maintainer. Looking forward to our collaboration!

Thank you for your detailed report, andrei-rusu. I understand that you're experiencing issues with the ConversationalRetrievalChain when using the ChatGoogleGenerativeAI integration. Let's try to address your concerns one by one.

  1. Regarding the model defaulting to "I don't know" or "I cannot find the information in the context provided": this could have several causes. The model may not be interpreting the provided context correctly, perhaps because of the complexity, structure, or language of the text. Alternatively, the model may be failing to map the user's question to the relevant context, for example because of how the question is phrased. It is also possible that the ConversationalRetrievalChain itself is mishandling the context and the question. A quick way to check the retrieval side in isolation is sketched after this list.

  2. Regarding the LangChain-built system prompts confusing the model: this could be due to how the prompts are structured. The example prompt you provided asks the model to rephrase a question in its "original language", which may be what causes the model to respond in a random language. It is also possible that the ConversationalRetrievalChain is generating these prompts incorrectly.
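
For the first issue, you can query the retriever directly and inspect what it returns for a failing question (a minimal sketch, reusing the `vectorstore` from your snippet):

retriever = vectorstore.as_retriever()
docs = retriever.get_relevant_documents("what does the second amendment say?")
for i, doc in enumerate(docs):
    # Show the start of each retrieved chunk to verify it contains the answer
    print(f"--- doc {i} ---")
    print(doc.page_content[:300])

If the retrieved chunks do contain the answer, the problem lies with the prompt or the model rather than with the vector DB.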

To better understand and address these issues, it would help to know more about the specific scenarios in which they occur, the exact texts and questions that trigger them, and any other details that could explain what is going wrong.

In the meantime, you might want to experiment with different settings for the ConversationalRetrievalChain and the ChatGoogleGenerativeAI integration to see whether another configuration works better for your use case. You might also try different models.
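
For example (a sketch only; the values below are arbitrary starting points, not recommendations), you could widen the retrieval and change the search strategy:

retriever = vectorstore.as_retriever(
    search_type="mmr",       # try "similarity" (the default) versus "mmr"
    search_kwargs={"k": 6},  # retrieve more chunks than the default 4
)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
)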

I hope this helps, and I look forward to hearing more from you so we can better address these issues.


andrei-rusu commented 8 months ago

I was able to replicate the second issue with a simpler test case:

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vectorstore = DocArrayInMemorySearch.from_texts(
    ["Gemini Pro is a Large Language Model that was made by GoogleDeepMind",
     "Gemini can be either a star sign or a name of a series of language models",
     "A Language model is trained by predicting the next token",
     "LLMs can easily do a variety of NLP tasks as well as text generation"],
    embedding=embeddings
)
llm = ChatGoogleGenerativeAI(model='gemini-pro', convert_system_message_to_human=True, temperature=0.7)
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    memory=memory,
)
chain.invoke({"question": "Who made Gemini Pro?"})  # works well
chain.invoke({"question": "What is Gemini?"})  # answers in a random language

As stated previously, the issue also appears when using a similar prompt with the Google API directly:

prompt = """
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\n\nChat History:\n\nHuman: what is your name?\nAssistant: My name is Bard.\nFollow Up Input: what is Bard?\nStandalone question:
"""
import google.generativeai as genai
from IPython.display import Markdown

model = genai.GenerativeModel('gemini-pro')  # assumes genai.configure(api_key=...) was called beforehand
resp = model.generate_content(prompt)
Markdown(resp.text)  # will give an answer in a random language

I believe the issue lies in this follow-up prompt, probably in the word "original". I reckon "its original language" is wrongly interpreted by the model to mean "an original language" rather than the language of the original input. Replacing "original" with "input" seems to fix the issue. For anyone wondering how to achieve this without modifying the source code:

chain.question_generator.prompt.template = chain.question_generator.prompt.template.replace("original", "input")
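
For completeness, you can verify the patched template and re-run the previously failing query (assuming the chain from the test case above):

print(chain.question_generator.prompt.template)  # confirm "original" has been replaced
chain.invoke({"question": "What is Gemini?"})  # should now answer in the input language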