AgentExecutor giving inconsistent results

ragvendra3898 commented 8 months ago

Checked other resources

[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.

Example Code

Hi team, I am using chromadb to store uploaded documents and an agent to answer questions from the database. The agent gives inconsistent results from run to run, and the probability of getting a correct answer is about 0.1. How can I fix this?

from langchain.chains import ChatVectorDBChain, RetrievalQA, RetrievalQAWithSourcesChain, ConversationChain
from langchain.agents import initialize_agent, Tool, load_tools, AgentExecutor, AgentType, ConversationalChatAgent
from langchain.tools import BaseTool, tool

vectordb = connect_chromadb()
search_qa = RetrievalQAWithSourcesChain.from_chain_type(llm=llm, chain_type="stuff", 
                    retriever=vectordb.as_retriever(search_type="mmr", search_kwargs={"filter": filters}), return_source_documents=True, 
                    chain_type_kwargs=digitaleye_templates.qa_summary_kwargs, reduce_k_below_max_tokens=True)

summary_qa = RetrievalQAWithSourcesChain.from_chain_type(llm=llm, chain_type="stuff", 
                         retriever=vectordb.as_retriever(search_type="mmr", search_kwargs={"filter": filters}), 
                         return_source_documents=True, chain_type_kwargs=digitaleye_templates.general_summary_kwargs, 
                         reduce_k_below_max_tokens=True)

detools = [
        Tool(
            name = "QA Search",
            func=search_qa,
            description="Useful for when you want to search a document store for the answer to a question based on facts contained in those documents.",
            return_direct=True,
        ),
        Tool(
            name = "General Summary",
            func=summary_qa,
            description="Useful for when you want to summarize a document for the answer to a question based on facts contained in those documents.",
            return_direct=True,
        ),
    ]

agent = initialize_agent(tools=detools, llm=llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                                verbose=True,
                                agent_kwargs={
                                    'prefix':PREFIX,
                                    #'format_instructions':FORMAT_INSTRUCTIONS,
                                    #'suffix':SUFFIX,
                                    "input_variables": ["input","agent_scratchpad"],
                                },
                                #prefix=PREFIX,
                                #format_instructions=FORMAT_INSTRUCTIONS,
                                #suffix=SUFFIX,
                                max_iterations=3,
                                return_intermediate_steps=False,
                                early_stopping_method="generate",
                                handle_parsing_errors="Check your output and make sure it conforms! Do not output an action and a final answer at the same time.",
                                #streaming=True,
                             )
answer = agent.run({"input": "summarize mrileftkneereport.pdf"})
print(answer)

# Here are samples of the answers I received

answer1 = "As an AI model, I’m unable to open or read files such as “mrileftkneereport.pdf”. However, I can help you generate a 
                   summary or answer specific questions based on the text content you provide from the report. Please copy and paste 
                   the text from the report, and I’ll do my best to assist you."

answer2 = "As an AI, I’m unable to access or analyze specific files such as “mrileftkneereport.pdf”. However, I can guide you on 
                  how to interpret a radiology report based on the guidelines provided.
                  For a general summary, you would need to identify the patient’s name, date of exam, type of exam requested, and 
                  radiologist name from the report. The first paragraph of the summary should include the reason for the screening, 
                  screening time, protocol, patient background, and comparison date.
                 The second paragraph should detail the specific findings, including the number of nodules detected, their precise 
                 position and size, findings on lungs, heart, pleura, coronary artery calcification, mediastinum/hilum/axilla, and other 
                 findings.
                 The third paragraph should provide conclusions, including the overall lung-rads category, management 
                 recommendation and follow-up date, based on the lesion.
               If there are incidental findings, these should be introduced in the fourth paragraph.
               For a specific question, you would need to locate the relevant information in the report. For example, if the question is 
               about the size of the largest lung nodule, you would need to find the section of the report that details the nodules and 
               their sizes.
               If the question is about the radiologist’s recommendation for managing a lung nodule, you would need to find the 
               section of the report that provides the conclusions and recommendations.
              Remember to provide the answer in a clear and accurate manner, and provide any relevant context if necessary.
              If the question contains both a general summary request and a specific question, prioritize answering the specific 
              question.
              Avoid using legal jargon in your responses and answer what you can based on the information available in the report. If 
              you are unable to answer the question, explain why, such as the information not being included in the report."

answer3 = "The question appears to be the name of the file, “mrileftkneereport.pdf”, rather than a specific question or a request 
                   for a general summary of the radiology report. Therefore, it’s not clear what information is being requested. Please 
                   provide a specific question or request for a general summary of the radiology report."

answer4 = "The question provided is “mrileftkneeereport.pdf” which does not specify whether a general summary or a specific 
                   detail is required from the MRI report. However, I can provide a general summary based on the information given in 
                  the report.

                   Summary Report

                  Patient’s Name: Regina Doe
                  Date of Exam: Not specified
                  Type of Exam: MRI Left Knee
                  Radiologist Name: Ross Banner, MD

                  The MRI screening was performed due to a left knee twisting injury that Regina Doe experienced on 1/1/2013. The 
                  screening was conducted using a standard protocol on a .3 Tesla Open MRI system, without IV contrast. There were 
                  no previous scans available for comparison.

                 The MRI findings revealed a complete mid substance disruption of the anterior cruciate ligament, with its fibers 
                 fibrillating within the joint. This has resulted in the buckling of the posterior crucial ligament. There is also edema 
                 relative to the medial collateral ligament, indicating a grade 1 injury. The lateral collateral ligament complex, including 
                 the iliotibial band, biceps femoris tendon, fibular collateral ligament, and popliteus muscle and tendon, are thought to 
                 be intact. The menisci and patella appear to be in good condition, although there is posterior meniscal capsular 
                 junction edema. A large suprapatellar bursal effusion and mild reactive synovitis were also noted. The osseous structures and 
                 periarticular soft tissues were largely unremarkable, except for a deepened lateral condylar patellar sulcus of the femur.

             The conclusions drawn from the MRI report include a complete full-thickness disruption of the anterior cruciate 
             ligament, an associated osseous contusion of the lateral condylar patellar sulcus (indicative of a pivot shift injury), and a 
             grade 1 MCL complex injury. No other associated injuries were identified."

Here answer4 is correct, but why am I not getting it consistently?

Please help me with this; I would be grateful.

Error Message and Stack Trace (if applicable)

No response

Description

I am trying to get answers from a chromadb vectorstore using an agent, but it produces inconsistent results from one run to the next.
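One way to narrow this down is to record exactly which string the agent passes to each tool, because the tool function receives only the action input the agent chose. The following is a minimal sketch for diagnosis, not a confirmed fix. `make_logged_tool_func` and `fake_chain` are hypothetical names; `fake_chain` only stands in for a chain such as `summary_qa`, and it is an assumption that the real chain returns a dict containing an `answer` key.

```python
from typing import Callable, Dict, List


def make_logged_tool_func(
    chain: Callable[[Dict[str, str]], Dict[str, str]],
    seen_inputs: List[str],
) -> Callable[[str], str]:
    """Wrap a chain-like callable so the tool returns a plain string.

    Every input the agent sends is appended to `seen_inputs` for inspection.
    """

    def run(tool_input: str) -> str:
        seen_inputs.append(tool_input)  # record what the agent actually asked
        result = chain({"question": tool_input})
        return result["answer"]  # hand back text, not the whole dict

    return run


# Hypothetical stand-in for a chain such as summary_qa (assumption: the real
# chain returns a dict that contains an "answer" key).
def fake_chain(inputs: Dict[str, str]) -> Dict[str, str]:
    return {"answer": f"summary for: {inputs['question']}", "sources": ""}


seen: List[str] = []
summary_tool_func = make_logged_tool_func(fake_chain, seen)
print(summary_tool_func("mrileftkneereport.pdf"))
print(seen)
```

If the logged inputs differ between runs (for example, the bare filename in one run and a full sentence in another), the retriever is being queried with different text each time, which could account for part of the variation.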

System Info

System Information

OS: Windows
OS Version: 10.0.22621
Python Version: 3.9.11 (main, Mar 30 2022, 02:45:55) [MSC v.1916 64 bit (AMD64)]

Package Information

langchain_core: 0.1.10
langchain: 0.1.0
langchain_community: 0.0.12
langserve: Not Found

dosubot[bot] commented 8 months ago

🤖

Thank you for providing a detailed description of your issue. It seems like the inconsistency in the results might be due to the nature of the underlying language model or the retrieval process from the chromadb vectorstore. However, to better understand and diagnose the issue, I would need a bit more information.

1. Could you please provide more details about the llm object that you're using? What kind of language model is it and how is it configured?
2. Are there any specific patterns you've noticed in the inconsistent results? For example, does the inconsistency occur with certain types of queries or documents?
3. Could you share more about the filters you're using in the search_kwargs? What kind of documents are you trying to retrieve from the chromadb vectorstore?
4. How are you handling the PREFIX and FORMAT_INSTRUCTIONS in your agent_kwargs? Are they static or dynamic based on the input?

The more information you can provide, the better I can assist you in resolving this issue.

ragvendra3898 commented 8 months ago

Sure, here is the llm object I am using:

from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name='gpt-4', temperature=0)

Also, filters is None:

filters = None

Here is the prompt I am using for the general summary:

general_summary_prompt_template = """
Context: {summaries}
Question: {question}

Task:

Summarize the content of the file with the filename mentioned in the context.

If the answer is not provided in the context, start your answer with "I'm sorry, I couldn't find that information in the context." Provide a concise and informative summary of the main points in the document. Include key information and insights found in the file. If there is a URL link in the context, provide it in your response.

Notes:

Please use natural language and avoid using jargon in your responses. Don't say things like "The provided text does not contain enough information", just answer what you can based on the context. If you are unable to answer the question, please explain why. """

general_summary_prompt = PromptTemplate(input_variables=["summaries", "question"], template=general_summary_prompt_template)
general_summary_kwargs = {"prompt": general_summary_prompt}
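To see what the model actually receives, it can help to render the template by hand with and without retrieved text. This is a rough sketch only: it uses plain `str.format` as a stand-in for the prompt class, a shortened copy of the template, and made-up retrieved text, so `render`, `with_context`, and `without_context` are hypothetical names rather than anything from the library.

```python
# Plain-Python stand-in for filling the prompt (assumption: the template
# uses the default "{name}" placeholder style, as in the snippet above).
template = (
    "Context: {summaries}\n"
    "Question: {question}\n\n"
    "Task:\n"
    "Summarize the content of the file with the filename mentioned in the context.\n"
)


def render(summaries: str, question: str) -> str:
    """Fill the two placeholders before the text would be sent to the LLM."""
    return template.format(summaries=summaries, question=question)


# Hypothetical retrieved text, used only for illustration.
with_context = render(
    "Content: MRI left knee findings...\nSource: mrileftkneereport.pdf",
    "summarize mrileftkneereport.pdf",
)
# What the prompt looks like if retrieval returns nothing relevant.
without_context = render("", "mrileftkneereport.pdf")

print(with_context)
print(without_context)
```

When the context is empty or does not mention the filename, the task line asks the model to summarize a file it cannot see, which would be consistent with the generic replies shown earlier in the thread.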

Siddhijain16 commented 8 months ago

I am facing the same issue. Most of the time it gives me a generic answer and does not use the document I passed as context. @ragvendra3898, any update on this?
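To put a number on "most of the time", the same query can be run repeatedly and the replies counted by type. The sketch below is only an illustration under stated assumptions: `fake_agent` is a hypothetical stand-in for a call such as `agent.run`, and the marker phrases are taken from the sample answers quoted in this thread, so they would need adjusting for other prompts.

```python
from collections import Counter
from typing import Callable

# Phrases taken from the unhelpful sample answers earlier in this thread.
REFUSAL_MARKERS = ("unable to open", "unable to access", "not clear what information")


def classify(answer: str) -> str:
    """Label an answer as 'generic' if it contains a known refusal phrase."""
    lowered = answer.lower()
    return "generic" if any(m in lowered for m in REFUSAL_MARKERS) else "grounded"


def tally(run_agent: Callable[[str], str], query: str, trials: int = 10) -> Counter:
    """Call the agent `trials` times with the same query and count each label."""
    return Counter(classify(run_agent(query)) for _ in range(trials))


# Hypothetical stand-in for agent.run; it alternates between the two behaviours.
_calls = {"n": 0}


def fake_agent(query: str) -> str:
    _calls["n"] += 1
    if _calls["n"] % 2:
        return "As an AI, I'm unable to access or analyze specific files."
    return "Patient's Name: Regina Doe. Type of Exam: MRI Left Knee."


counts = tally(fake_agent, "summarize mrileftkneereport.pdf", trials=4)
print(counts)
```

Replacing `fake_agent` with the real agent call gives a rough success rate to compare before and after any change to the prompts or tools.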
