ValidationError: 1 validation error for MapRerankDocumentsChain

Lin-jun-xiang commented 1 year ago

System Info

Langchain-0.0.215
Python3.8.6

Who can help?

@hwchase17

Information

[ ] The official example notebooks/scripts
[ ] My own modified scripts

Related Components

[ ] LLMs/Chat Models
[ ] Embedding Models
[X] Prompts / Prompt Templates / Prompt Selectors
[ ] Output Parsers
[ ] Document Loaders
[X] Vector Stores / Retrievers
[ ] Memory
[ ] Agents / Agent Executors
[ ] Tools / Toolkits
[ ] Chains
[ ] Callbacks/Tracing
[ ] Async

Reproduction

Results for each chain type passed to RetrievalQA.from_chain_type():

stuff: works
refine: fails unless the correct parameter names are used (see #6912)
map_rerank: fails
map_reduce: fails

Each failing chain type raises a similar validation error. For example, with map_rerank:

code:

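from langchain.chains import RetrievalQA
from langchain.document_loaders import PyMuPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

prompt_template = """
Use the following pieces of context to answer the question, if you don't know the answer, leave it blank don't try to make up an answer.

{context}

Question: {question}

Answer in JSON representations
"""

# Custom prompt; note that no output parser is attached to it.
QA_PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=['context', 'question']
)
chain_type_kwargs = {
    'prompt': QA_PROMPT,
    'verbose': True
}

# Load the PDF and split it into chunks.
# chunk_size and chunk_overlap are set elsewhere in the original script (values not given).
docs = PyMuPDFLoader('file.pdf').load()
splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap
)
docs = splitter.split_documents(docs)

# Index the chunks in a Chroma vector store.
embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(
    documents=docs,
    embedding=embeddings
)

# This call raises the ValidationError shown below.
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0.2),
    chain_type='map_rerank',
    retriever=db.as_retriever(),
    chain_type_kwargs=chain_type_kwargs
)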

result:

ValidationError                           Traceback (most recent call last)
c:\Users\JunXiang\AppData\Local\Programs\Python\Python38\lib\site-packages\langchain\chains\retrieval_qa\base.py in from_chain_type(cls, llm, chain_type, chain_type_kwargs, **kwargs)
     89         """Load chain from chain type."""
     90         _chain_type_kwargs = chain_type_kwargs or {}
---> 91         combine_documents_chain = load_qa_chain(
     92             llm, chain_type=chain_type, **_chain_type_kwargs
     93         )

c:\Users\JunXiang\AppData\Local\Programs\Python\Python38\lib\site-packages\langchain\chains\question_answering\__init__.py in load_qa_chain(llm, chain_type, verbose, callback_manager, **kwargs)
    236             f"Should be one of {loader_mapping.keys()}"
...
c:\Users\JunXiang\AppData\Local\Programs\Python\Python38\lib\site-packages\pydantic\main.cp38-win_amd64.pyd in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for MapRerankDocumentsChain
__root__
  Output parser of llm_chain should be a RegexParser, got None (type=value_error)

Expected behavior

The chain should be created without crashing when I run the code.

dev2049 commented 1 year ago

I'm not able to reproduce with this:

from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

db = FAISS.from_texts(["foo"], OpenAIEmbeddings())

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0.2),
    chain_type='map_rerank',
    retriever=db.as_retriever(),
    chain_type_kwargs={}
)

This is off master, on Python 3.11. Anything I'm missing? What's in chain_type_kwargs?

Lin-jun-xiang commented 1 year ago

I've updated the code.

exceedzhang commented 1 year ago

map_rerank does not work.

ChrisDelClea commented 1 year ago

Same error here.

wozwdaqian commented 1 year ago

Using RetrievalQA.from_chain_type raises "Output parser of llm_chain should be a RegexParser, got None (type=value_error)". However, when I load the chain with load_qa_chain as described in the documentation, there is no problem.

dosubot[bot] commented 9 months ago

Hi, @Lin-jun-xiang,

I'm helping the LangChain team manage our backlog and am marking this issue as stale. It looks like you raised an issue regarding a validation error for the MapRerankDocumentsChain in the Langchain system. Several users have attempted to recreate the issue and encountered the same error, with one user providing a screenshot of the error message. The issue is still ongoing and has not been resolved.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

I appreciate your attention to this matter.

Best, Dosu

shenyaojun commented 9 months ago

I encountered this same error today with Python 3.10.9. It works if I change chain_type to "stuff".

dosubot[bot] commented 9 months ago

@baskaryan, could you please assist with the validation error for the MapRerankDocumentsChain in the Langchain system? The user encountered the same error today with Python 3.10.9 and provided a screenshot of the error message. Changing the chain_type to "stuff" resolved the issue. Thank you!

shenyaojun commented 9 months ago

langchain: 0.0.352

naveenfaclon commented 7 months ago

ValidationError: 1 validation error for StuffDocumentsChain

dosubot[bot] commented 7 months ago

🤖

Based on the error message you're encountering, it seems that the output_parser of the llm_chain is not an instance of RegexParser or it is None. In the MapRerankDocumentsChain class, there is a root validator method validate_llm_output that checks if the output_parser of the llm_chain is an instance of RegexParser. If it is not, it raises a ValueError.

To resolve this error, you need to ensure that the output_parser of the llm_chain is an instance of RegexParser. Here is an example of how to do this:

from langchain.chains import LLMChain
from langchain.output_parsers.regex import RegexParser

output_parser = RegexParser(
    regex=r"(.*?)\nScore: (.*)",
    output_keys=["answer", "score"],
)
llm_chain = LLMChain(llm=llm, prompt=prompt, output_parser=output_parser)

In this example, output_parser is an instance of RegexParser, and it is used when creating an instance of LLMChain. This should prevent the validation error from being raised. You can find more details about this in the MapRerankDocumentsChain source code.
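A complementary sketch, in case passing the parser to LLMChain is not enough: as far as I recall the source, the validator reads the parser from the prompt (llm_chain.prompt.output_parser), so attaching the RegexParser to the PromptTemplate may be what is needed. Treat that as an assumption and check it against your installed version. RERANK_TEMPLATE, build_map_rerank_prompt and split_answer_and_score are hypothetical names for illustration. The template also asks the model for a "Score:" line, because the regex cannot match a reply without one.

```python
import re

# Pattern from the comment above: answer text, a newline, then "Score: <value>".
SCORE_REGEX = r"(.*?)\nScore: (.*)"

# Hypothetical template: it must request a "Score:" line, or the regex never matches.
RERANK_TEMPLATE = """Use the following pieces of context to answer the question.
If you don't know the answer, say so; don't make one up.

{context}

Question: {question}

Reply with the answer, then a new line of the form "Score: <0-100>"
rating how fully the context answers the question.
"""


def build_map_rerank_prompt():
    """Return a PromptTemplate carrying a RegexParser (requires langchain)."""
    # Imported lazily so the stdlib-only helper below works without langchain.
    from langchain.output_parsers.regex import RegexParser
    from langchain.prompts import PromptTemplate

    parser = RegexParser(regex=SCORE_REGEX, output_keys=["answer", "score"])
    # Assumption: the chain's validator looks for the parser on the prompt.
    return PromptTemplate(
        template=RERANK_TEMPLATE,
        input_variables=["context", "question"],
        output_parser=parser,
    )


def split_answer_and_score(text):
    """Stdlib stand-in showing what the regex extracts from one LLM reply."""
    match = re.search(SCORE_REGEX, text)
    if match is None:
        raise ValueError(f"reply does not match {SCORE_REGEX!r}")
    return {"answer": match.group(1), "score": match.group(2)}
```

If the assumption holds, passing chain_type_kwargs={'prompt': build_map_rerank_prompt()} to RetrievalQA.from_chain_type with chain_type='map_rerank' should satisfy the validator. split_answer_and_score is only there to check a sample reply against the pattern without calling an LLM.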

Regarding the chain_type issue, it seems that different chain types require different keywords in their prompt templates. For example, in the Refine chain, the input variables are question_prompt and refine_prompt. You can find more details about this in the issue #6590.
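To make the keyword differences concrete, here is a small stdlib-only sketch that flags prompt keywords a chain type is unlikely to accept. The PROMPT_KWARGS table reflects my reading of the load_qa_chain loaders and is an assumption, not taken from this thread; unexpected_prompt_kwargs is a hypothetical helper, not part of LangChain.

```python
# Prompt keyword names per chain type (assumed; verify against your version).
PROMPT_KWARGS = {
    "stuff": {"prompt"},
    "map_rerank": {"prompt"},
    "map_reduce": {"question_prompt", "combine_prompt", "collapse_prompt"},
    "refine": {"question_prompt", "refine_prompt"},
}


def unexpected_prompt_kwargs(chain_type, chain_type_kwargs):
    """Return the prompt-like keys that this chain type does not list."""
    if chain_type not in PROMPT_KWARGS:
        raise ValueError(f"unknown chain_type: {chain_type!r}")
    # Only inspect keys that look like prompts; leave e.g. 'verbose' alone.
    prompt_like = {key for key in chain_type_kwargs if key.endswith("prompt")}
    return sorted(prompt_like - PROMPT_KWARGS[chain_type])
```

Under this table, the chain_type_kwargs from the original report ('prompt' and 'verbose') would be flagged for map_reduce and refine but not for stuff or map_rerank. That matches the report that stuff works, while map_rerank fails for a separate reason, the missing output parser.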

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

