langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.06k stars · 14.65k forks

load_qa_chain with map_rerank by local huggingface model #3970

Closed flaviadeutsch closed 9 months ago

flaviadeutsch commented 1 year ago

I use the huggingface model locally and run the following code:

from langchain.chains.question_answering import load_qa_chain

# chatglm is the local HuggingFace LLM wrapper; PROMPT is a custom prompt
# with an output parser (both defined elsewhere in my notebook).
chain = load_qa_chain(llm=chatglm, chain_type="map_rerank", return_intermediate_steps=True, prompt=PROMPT)
chain({"input_documents": search_docs_Documents, "question": query}, return_only_outputs=True)

The error is as follows:

─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /tmp/ipykernel_274378/983731820.py:2 in <module>                                                 │
│                                                                                                  │
│ [Errno 2] No such file or directory: '/tmp/ipykernel_274378/983731820.py'                        │
│                                                                                                  │
│ /tmp/ipykernel_274378/14951549.py:11 in answer_docs                                              │
│                                                                                                  │
│ [Errno 2] No such file or directory: '/tmp/ipykernel_274378/14951549.py'                         │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py:116 in   │
│ __call__                                                                                         │
│                                                                                                  │
│   113 │   │   │   outputs = self._call(inputs)                                                   │
│   114 │   │   except (KeyboardInterrupt, Exception) as e:                                        │
│   115 │   │   │   self.callback_manager.on_chain_error(e, verbose=self.verbose)                  │
│ ❱ 116 │   │   │   raise e                                                                        │
│   117 │   │   self.callback_manager.on_chain_end(outputs, verbose=self.verbose)                  │
│   118 │   │   return self.prep_outputs(inputs, outputs, return_only_outputs)                     │
│   119                                                                                            │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py:113 in   │
│ __call__                                                                                         │
│                                                                                                  │
│   110 │   │   │   verbose=self.verbose,                                                          │
│   111 │   │   )                                                                                  │
│   112 │   │   try:                                                                               │
│ ❱ 113 │   │   │   outputs = self._call(inputs)                                                   │
│   114 │   │   except (KeyboardInterrupt, Exception) as e:                                        │
│   115 │   │   │   self.callback_manager.on_chain_error(e, verbose=self.verbose)                  │
│   116 │   │   │   raise e                                                                        │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_document │
│ s/base.py:75 in _call                                                                            │
│                                                                                                  │
│    72 │   │   docs = inputs[self.input_key]                                                      │
│    73 │   │   # Other keys are assumed to be needed for LLM prediction                           │
│    74 │   │   other_keys = {k: v for k, v in inputs.items() if k != self.input_key}              │
│ ❱  75 │   │   output, extra_return_dict = self.combine_docs(docs, **other_keys)                  │
│    76 │   │   extra_return_dict[self.output_key] = output                                        │
│    77 │   │   return extra_return_dict                                                           │
│    78                                                                                            │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_document │
│ s/map_rerank.py:97 in combine_docs                                                               │
│                                                                                                  │
│    94 │   │                                                                                      │
│    95 │   │   Combine by mapping first chain over all documents, then reranking the results.     │
│    96 │   │   """                                                                                │
│ ❱  97 │   │   results = self.llm_chain.apply_and_parse(                                          │
│    98 │   │   │   # FYI - this is parallelized and so it is fast.                                │
│    99 │   │   │   [{**{self.document_variable_name: d.page_content}, **kwargs} for d in docs]    │
│   100 │   │   )                                                                                  │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:192 in    │
│ apply_and_parse                                                                                  │
│                                                                                                  │
│   189 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   190 │   │   """Call apply and then parse the results."""                                       │
│   191 │   │   result = self.apply(input_list)                                                    │
│ ❱ 192 │   │   return self._parse_result(result)                                                  │
│   193 │                                                                                          │
│   194 │   def _parse_result(                                                                     │
│   195 │   │   self, result: List[Dict[str, str]]                                                 │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:198 in    │
│ _parse_result                                                                                    │
│                                                                                                  │
│   195 │   │   self, result: List[Dict[str, str]]                                                 │
│   196 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   197 │   │   if self.prompt.output_parser is not None:                                          │
│ ❱ 198 │   │   │   return [                                                                       │
│   199 │   │   │   │   self.prompt.output_parser.parse(res[self.output_key]) for res in result    │
│   200 │   │   │   ]                                                                              │
│   201 │   │   else:                                                                              │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/llm.py:199 in    │
│ <listcomp>                                                                                       │
│                                                                                                  │
│   196 │   ) -> Sequence[Union[str, List[str], Dict[str, str]]]:                                  │
│   197 │   │   if self.prompt.output_parser is not None:                                          │
│   198 │   │   │   return [                                                                       │
│ ❱ 199 │   │   │   │   self.prompt.output_parser.parse(res[self.output_key]) for res in result    │
│   200 │   │   │   ]                                                                              │
│   201 │   │   else:                                                                              │
│   202 │   │   │   return result                                                                  │
│                                                                                                  │
│ /home/hysz/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/output_parsers/regex.py │
│ :28 in parse                                                                                     │
│                                                                                                  │
│   25 │   │   │   return {key: match.group(i + 1) for i, key in enumerate(self.output_keys)}      │
│   26 │   │   else:                                                                               │
│   27 │   │   │   if self.default_output_key is None:                                             │
│ ❱ 28 │   │   │   │   raise ValueError(f"Could not parse output: {text}")                         │
│   29 │   │   │   else:                                                                           │
│   30 │   │   │   │   return {                                                                    │
│   31 │   │   │   │   │   key: text if key == self.default_output_key else ""                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Could not parse output: 
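The exception comes from `RegexParser.parse` (the last frame of the traceback): when the model's text doesn't match the expected pattern and no `default_output_key` is set, the parser raises instead of falling back. A minimal stdlib sketch of that logic, assuming a hypothetical `answer … Score: N` pattern (the real pattern comes from the prompt's output parser):

```python
import re

# Stdlib sketch of the parse step in langchain/output_parsers/regex.py
# (last frame of the traceback). The pattern below is an assumption;
# the actual regex is supplied by the map_rerank prompt.
PATTERN = r"(.*?)\s*Score:\s*(\d+)"
OUTPUT_KEYS = ["answer", "score"]

def parse(text, default_output_key=None):
    match = re.search(PATTERN, text, re.DOTALL)
    if match:
        return {key: match.group(i + 1) for i, key in enumerate(OUTPUT_KEYS)}
    if default_output_key is None:
        # This is the branch that fires in the traceback above.
        raise ValueError(f"Could not parse output: {text}")
    # With a default key set, unparseable text becomes the answer instead.
    return {key: text if key == default_output_key else "" for key in OUTPUT_KEYS}
```

So one mitigation is to give the prompt's parser a `default_output_key`, which turns the hard failure into a degraded answer rather than aborting the chain.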
aravind-selvam commented 1 year ago

I wanted to share that I am also encountering the same issue with the load_qa_chain function when using the map_rerank chain type with a local HuggingFace model. Waiting for a fix from the developers.

cmazzoni87 commented 1 year ago

Experiencing the same issue. The local HuggingFace embedding model used is 'sentence-transformers/all-mpnet-base-v2'; the base model is Dolly.

keanduffey commented 1 year ago

Same for me, @cmazzoni87.

cmazzoni87 commented 1 year ago

@keanduffey it's exactly the same traceback lines as the ones from @flaviadeutsch.

Raji635 commented 1 year ago

I am also facing the exact same issue when using load_qa_with_sources_chain with map_rerank and an OpenAI model. Do let me know if anyone knows how to resolve it.

andysingal commented 1 year ago

Having the same issue too. I tried an OutputParser from LangChain but still was not able to resolve it.

anggoro-yn commented 1 year ago

I can run load_qa_chain with map_rerank on Google Colab, but it fails on my local Jupyter notebook.

The error message: ValueError: Could not parse output: AI assistants can help provide personalized offerings and tailored messaging to customers, enrich service interactions, and predict customer needs based on profile data. Score: 100

Does anyone know the solution?
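Note that the string in the error above does contain "Score: 100", yet parsing can still fail if the parser's regex expects the score on its own line. A quick check with a newline-requiring pattern (the exact default pattern is an assumption, shown here for illustration):

```python
import re

# Hypothetical newline-requiring pattern, similar in spirit to the
# default map_rerank output parser's regex.
pattern = r"(.*?)\nScore: (\d*)"

same_line = "AI assistants can help predict customer needs. Score: 100"
own_line = "AI assistants can help predict customer needs.\nScore: 100"

print(re.search(pattern, same_line, re.DOTALL) is None)  # True: no match, so the chain raises
print(re.search(pattern, own_line, re.DOTALL) is None)   # False: this form parses fine
```

If this is what's happening, the model is producing the score inline rather than in the exact layout the prompt's parser expects.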

anggoro-yn commented 1 year ago

With map_reduce, it works perfectly. Somehow it fails on map_rerank.

keanduffey commented 1 year ago

Recently got a similar "Could not parse output" error trying to implement an agent, again using the Dolly v2 LLM. After reading up on that issue, it seems this may just be related to lower-powered LLMs that cannot produce the text format the prompt expects; I'm wondering if that's all that's going on with map_rerank too. Although, if that were all it is, I would have thought @Raji635 using OpenAI's model wouldn't have hit the same error.
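If the root cause really is models that drift from the expected format, one user-side workaround (a sketch, not an official LangChain API) is to wrap the parse step so unparseable output degrades gracefully instead of aborting the whole chain:

```python
import re

# Hypothetical fault-tolerant parse: if the model's text doesn't match
# the assumed "answer ... Score: N" shape, keep the raw text as the
# answer with a score of 0 rather than raising ValueError.
def tolerant_parse(text, pattern=r"(.*?)\s*Score:\s*(\d+)"):
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return {"answer": match.group(1), "score": int(match.group(2))}
    return {"answer": text.strip(), "score": 0}
```

Documents whose answers fall back to score 0 simply rank last during reranking instead of crashing the run.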

Object-Oriented101 commented 1 year ago

Can confirm same issue with Falcon-7b-instruct on Jupyter Notebook

rjtmehta99 commented 1 year ago

Happening with Flan-T5-large too.

dosubot[bot] commented 9 months ago

Hi, @flaviadeutsch! I'm Dosu, and I'm here to help the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue you reported is a ValueError: Could not parse output raised by the prompt's RegexParser when running load_qa_chain with chain_type="map_rerank" and a local HuggingFace model. Several other users, including aravind-selvam, cmazzoni87, and Raji635, have encountered the same issue with different models. Some users attempted to resolve it with an OutputParser from LangChain, but unfortunately the issue persists. Additionally, users like anggoro-yn and keanduffey have shared their experiences and possible explanations for the error.

Before we proceed, I wanted to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding, and please don't hesitate to reach out if you have any further questions or concerns.

Best regards, Dosu