langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

load_summarize_chain(chain_type="map_reduce") is not working for OpenAI models #26424

Closed KalyaniBogala closed 2 weeks ago

KalyaniBogala commented 2 weeks ago

Example Code

from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_text_splitters import CharacterTextSplitter

loader = PyPDFLoader(r"path_to_your_file")
docs = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(docs)

llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini", openai_api_key="openai_api_key")

chain = load_summarize_chain(llm, chain_type="map_reduce")
chain.invoke({"input_documents": texts})

Error Message and Stack Trace (if applicable)


TypeError                                 Traceback (most recent call last)
Cell In[43], line 15
     13 # llm = ChatGoogleGenerativeAI(temperature=0, model="gemini-pro", google_api_key="")
     14 chain = load_summarize_chain(llm, chain_type="map_reduce")
---> 15 chain.invoke({"input_documents": texts})

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\base.py:164, in Chain.invoke(self, input, config, **kwargs)
    162 except BaseException as e:
    163     run_manager.on_chain_error(e)
--> 164     raise e
    165 run_manager.on_chain_end(outputs)
    167 if include_run_info:

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\base.py:154, in Chain.invoke(self, input, config, **kwargs)
    151 try:
    152     self._validate_inputs(inputs)
    153     outputs = (
--> 154         self._call(inputs, run_manager=run_manager)
    155         if new_arg_supported
    156         else self._call(inputs)
    157     )
    159     final_outputs: Dict[str, Any] = self.prep_outputs(
    160         inputs, outputs, return_only_outputs
    161     )
    162 except BaseException as e:

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\combine_documents\base.py:138, in BaseCombineDocumentsChain._call(self, inputs, run_manager)
    136 # Other keys are assumed to be needed for LLM prediction
    137 other_keys = {k: v for k, v in inputs.items() if k != self.input_key}
--> 138 output, extra_return_dict = self.combine_docs(
    139     docs, callbacks=_run_manager.get_child(), **other_keys
    140 )
    141 extra_return_dict[self.output_key] = output
    142 return extra_return_dict

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\combine_documents\map_reduce.py:226, in MapReduceDocumentsChain.combine_docs(self, docs, token_max, callbacks, **kwargs)
    214 def combine_docs(
    215     self,
    216     docs: List[Document],
        (...)
    219     **kwargs: Any,
    220 ) -> Tuple[str, dict]:
    221     """Combine documents in a map reduce manner.
    222
    223     Combine by mapping first chain over all documents, then reducing the results.
    224     This reducing can be done recursively if needed (if there are many documents).
    225     """
--> 226     map_results = self.llm_chain.apply(
    227         # FYI - this is parallelized and so it is fast.
    228         [{self.document_variable_name: d.page_content, **kwargs} for d in docs],
    229         callbacks=callbacks,
    230     )
    231     question_result_key = self.llm_chain.output_key
    232     result_docs = [
    233         Document(page_content=r[question_result_key], metadata=docs[i].metadata)
    234         # This uses metadata from the docs, and the textual results from results
    235         for i, r in enumerate(map_results)
    236     ]

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\llm.py:250, in LLMChain.apply(self, input_list, callbacks)
    248 except BaseException as e:
    249     run_manager.on_chain_error(e)
--> 250     raise e
    251 outputs = self.create_outputs(response)
    252 run_manager.on_chain_end({"outputs": outputs})

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\llm.py:247, in LLMChain.apply(self, input_list, callbacks)
    242 run_manager = callback_manager.on_chain_start(
    243     dumpd(self),
    244     {"input_list": input_list},
    245 )
    246 try:
--> 247     response = self.generate(input_list, run_manager=run_manager)
    248 except BaseException as e:
    249     run_manager.on_chain_error(e)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain\chains\llm.py:138, in LLMChain.generate(self, input_list, run_manager)
    136 callbacks = run_manager.get_child() if run_manager else None
    137 if isinstance(self.llm, BaseLanguageModel):
--> 138     return self.llm.generate_prompt(
    139         prompts,
    140         stop,
    141         callbacks=callbacks,
    142         **self.llm_kwargs,
    143     )
    144 else:
    145     results = self.llm.bind(stop=stop, **self.llm_kwargs).batch(
    146         cast(List, prompts), {"callbacks": callbacks}
    147     )

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py:777, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    769 def generate_prompt(
    770     self,
    771     prompts: List[PromptValue],
        (...)
    774     **kwargs: Any,
    775 ) -> LLMResult:
    776     prompt_messages = [p.to_messages() for p in prompts]
--> 777     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_core\language_models\chat_models.py:639, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    634     raise e
    635 flattened_outputs = [
    636     LLMResult(generations=[res.generations], llm_output=res.llm_output)  # type: ignore[list-item]
    637     for res in results
    638 ]
--> 639 llm_output = self._combine_llm_outputs([res.llm_output for res in results])
    640 generations = [res.generations for res in results]
    641 output = LLMResult(generations=generations, llm_output=llm_output)  # type: ignore[arg-type]

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_openai\chat_models\base.py:467, in BaseChatOpenAI._combine_llm_outputs(self, llm_outputs)
    465 for k, v in token_usage.items():
    466     if k in overall_token_usage:
--> 467         overall_token_usage[k] += v
    468     else:
    469         overall_token_usage[k] = v

TypeError: unsupported operand type(s) for +=: 'dict' and 'dict'
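
For context on what the trace shows: _combine_llm_outputs merges the per-call token_usage dicts from the parallel map step with +=, which works as long as every value is an int. Around this time the OpenAI API began including a nested dict inside token_usage (the completion_tokens_details field, if I'm reading the change right), so the merge ends up adding two dicts together. Since that change happened server-side, the error can show up even with unchanged local dependencies. A minimal sketch of the failure mode, using hypothetical usage payloads rather than LangChain's actual code:

# Minimal reproduction of the merge failure. The payloads below are made up,
# shaped like the OpenAI API's token_usage; this is not LangChain's code.

def combine_token_usage(usages: list) -> dict:
    overall = {}
    for usage in usages:
        for k, v in usage.items():
            if k in overall:
                overall[k] += v  # fine for ints, fails once v is a dict
            else:
                overall[k] = v
    return overall

usages = [
    {"prompt_tokens": 10, "completion_tokens": 5,
     "completion_tokens_details": {"reasoning_tokens": 0}},
    {"prompt_tokens": 12, "completion_tokens": 7,
     "completion_tokens_details": {"reasoning_tokens": 0}},
]

combine_token_usage(usages)
# TypeError: unsupported operand type(s) for +=: 'dict' and 'dict'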

Description

Code that ran successfully a week ago is no longer working. I have tried to resolve the issue by updating the relevant libraries (langchain, langchain-core, langchain-openai, openai), but those updates have not been effective.

System Info

langchain-openai : 0.1.24
openai : 1.45.0
langchain : 0.2.6
langchain-core : 0.2.39
python : 3.12

thomas-busser-polynom commented 2 weeks ago

Did you find a way to resolve the issue?

hwchase17 commented 2 weeks ago

This is a new param they added.

If you upgrade your langchain-openai package and import ChatOpenAI from there (from langchain_openai import ChatOpenAI), that should be fixed - if not, let us know!

We are working on updating langchain-community now.
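
For reference, a minimal sketch of the suggested import (the model and parameters are just the ones from this issue):

# ChatOpenAI from the langchain_openai partner package, where the fix lands:
from langchain_openai import ChatOpenAI

# The older community import path (the one still awaiting the update
# mentioned above) would be:
# from langchain_community.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")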

KalyaniBogala commented 2 weeks ago

After updating all four packages (langchain, langchain-core, langchain-openai, openai), the code is working fine.

Thank you

kentyman23 commented 2 weeks ago

> This is a new param they added.
>
> If you upgrade your langchain-openai package and import ChatOpenAI from there (from langchain_openai import ChatOpenAI), that should be fixed - if not, let us know!
>
> We are working on updating langchain-community now.

@hwchase17, can you give more detail about which version of which package broke, what commit fixed it, and what versions of which packages are required? I've been battling this problem since last week, but even updating the packages above didn't help me.

kentyman23 commented 2 weeks ago

Also, I'm unclear on why this would've broken when I didn't change dependencies. I was using these (albeit old) versions:

langchain                  0.1.20
langchain-community        0.0.38
langchain-core             0.1.52
langchain-openai           0.1.5
langchain-text-splitters   0.0.2
langsmith                  0.1.99
llvmlite                   0.43.0
openai                     1.10.0
pydantic                   2.8.2
pydantic_core              2.20.1

KalyaniBogala commented 2 weeks ago

These are the versions that worked for me:

langchain==0.2.16
langchain-community==0.2.6
langchain-core==0.2.39
langchain-openai==0.1.24
openai==1.45.0
pydantic==2.8.2
pydantic-core==2.20.1
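
For anyone comparing their environment against that list, a small standard-library sketch (assuming the installed distributions use the same names as above):

# Print the installed versions of the packages listed above.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("langchain", "langchain-community", "langchain-core",
            "langchain-openai", "openai", "pydantic", "pydantic-core"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")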