langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

KeyError: "tool" in langchain_experimental -> OllamaFunctions._generate #20513

Open pretbc opened 3 months ago

pretbc commented 3 months ago


Example Code

import os
from typing import List, Optional

from langchain.prompts import ChatPromptTemplate
from langchain_experimental.llms.ollama_functions import OllamaFunctions
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema.runnable import RunnableLambda
from langchain.document_loaders import WebBaseLoader
from langchain_core.output_parsers.openai_functions import JsonKeyOutputFunctionsParser
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.utils.function_calling import convert_to_openai_function

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
documents = loader.load()
doc = documents[0]

model = OllamaFunctions(temperature=0, model=os.environ['OPEN_HERMES_2_5'])

def flatten(matrix):
    """Merge the per-chunk lists of extracted papers into one flat list."""
    flat_list = []
    for row in matrix:
        flat_list += row
    return flat_list

class Paper(BaseModel):
    """Information about papers mentioned."""
    title: str
    author: Optional[str]

class Info(BaseModel):
    """Information to extract"""
    papers: List[Paper]

template = """An article will be passed to you. Extract from it all papers that are mentioned by this article.

Do not extract the name of the article itself. If no papers are mentioned that's fine - you don't need to extract any! Just return an empty list.

Do not make up or guess ANY extra information. Only extract what exactly is in the text."""

prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", "{input}")
])

paper_extraction_function = [
    convert_to_openai_function(Info)
]
extraction_model = model.bind(
    functions=paper_extraction_function, 
    function_call={"name":"Info"}
)

extraction_chain = prompt | extraction_model | JsonKeyOutputFunctionsParser(key_name="papers")
text_splitter = RecursiveCharacterTextSplitter(chunk_overlap=0)
prep = RunnableLambda(
    lambda x: [{"input": doc} for doc in text_splitter.split_text(x)]
)

chain = prep | extraction_chain.map() | flatten

chain.invoke(doc.page_content)
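
As a possible user-side guard (a sketch, not part of the tutorial code): wrapping the per-chunk extraction in with_fallbacks makes a chunk whose response lacks the expected keys yield an empty list instead of aborting the whole .map(); handling KeyError specifically is an assumption based on the traceback below.

# Sketch: fall back to an empty paper list for any chunk where the model's
# response does not parse into a tool call (KeyError per the traceback below).
safe_extraction_chain = extraction_chain.with_fallbacks(
    [RunnableLambda(lambda _: [])],
    exceptions_to_handle=(KeyError,),
)
chain = prep | safe_extraction_chain.map() | flatten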

Error Message and Stack Trace (if applicable)

A custom debug print(parsed_chat_result.keys()) was added at line 105 of OllamaFunctions._generate to check on which chunk the error occurred:

dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['thoughts', 'command'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])
dict_keys(['tool', 'tool_input'])


KeyError                                  Traceback (most recent call last)
Cell In[107], line 1
----> 1 chain.invoke(doc.page_content)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:2499, in RunnableSequence.invoke(self, input, config)
   2497 try:
   2498     for i, step in enumerate(self.steps):
-> 2499         input = step.invoke(
   2500             input,
   2501             # mark each step as a child run
   2502             patch_config(
   2503                 config, callbacks=run_manager.get_child(f"seq:step:{i+1}")
   2504             ),
   2505         )
   2506 # finish the root run
   2507 except BaseException as e:

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:4262, in RunnableEachBase.invoke(self, input, config, **kwargs)
   4259 def invoke(
   4260     self, input: List[Input], config: Optional[RunnableConfig] = None, **kwargs: Any
   4261 ) -> List[Output]:
-> 4262     return self._call_with_config(self._invoke, input, config, **kwargs)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:1625, in Runnable._call_with_config(self, func, input, config, run_type, **kwargs)
   1621 context = copy_context()
   1622 context.run(var_child_runnable_config.set, child_config)
   1623 output = cast(
   1624     Output,
-> 1625     context.run(
   1626         call_func_with_variable_args,  # type: ignore[arg-type]
   1627         func,  # type: ignore[arg-type]
   1628         input,  # type: ignore[arg-type]
   1629         config,
   1630         run_manager,
   1631         **kwargs,
   1632     ),
   1633 )
   1634 except BaseException as e:
   1635     run_manager.on_chain_error(e)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/config.py:347, in call_func_with_variable_args(func, input, config, run_manager, **kwargs)
   345 if run_manager is not None and accepts_run_manager(func):
   346     kwargs["run_manager"] = run_manager
--> 347 return func(input, **kwargs)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:4255, in RunnableEachBase._invoke(self, inputs, run_manager, config, **kwargs)
   4248 def _invoke(
   4249     self,
   4250     inputs: List[Input],
   (...)
   4253     **kwargs: Any,
   4254 ) -> List[Output]:
-> 4255     return self.bound.batch(
   4256         inputs, patch_config(config, callbacks=run_manager.get_child()), **kwargs
   4257     )

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:2643, in RunnableSequence.batch(self, inputs, config, return_exceptions, **kwargs)
   2641 else:
   2642     for i, step in enumerate(self.steps):
-> 2643         inputs = step.batch(
   2644             inputs,
   2645             [
   2646                 # each step a child run of the corresponding root run
   2647                 patch_config(
   2648                     config, callbacks=rm.get_child(f"seq:step:{i+1}")
   2649                 )
   2650                 for rm, config in zip(run_managers, configs)
   2651             ],
   2652         )
   2654 # finish the root runs
   2655 except BaseException as e:

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:4544, in RunnableBindingBase.batch(self, inputs, config, return_exceptions, **kwargs)
   4542 else:
   4543     configs = [self._merge_configs(config) for _ in range(len(inputs))]
-> 4544 return self.bound.batch(
   4545     inputs,
   4546     configs,
   4547     return_exceptions=return_exceptions,
   4548     **{**self.kwargs, **kwargs},
   4549 )

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:634, in Runnable.batch(self, inputs, config, return_exceptions, **kwargs)
   631     return cast(List[Output], [invoke(inputs[0], configs[0])])
   633 with get_executor_for_config(configs[0]) as executor:
--> 634     return cast(List[Output], list(executor.map(invoke, inputs, configs)))

File /usr/lib/python3.11/concurrent/futures/_base.py:619, in Executor.map.<locals>.result_iterator()
   616 while fs:
   617     # Careful not to keep a reference to the popped future
   618     if timeout is None:
--> 619         yield _result_or_cancel(fs.pop())
   620     else:
   621         yield _result_or_cancel(fs.pop(), end_time - time.monotonic())

File /usr/lib/python3.11/concurrent/futures/_base.py:317, in _result_or_cancel(failed resolving arguments)
   315 try:
   316     try:
--> 317         return fut.result(timeout)
   318     finally:
   319         fut.cancel()

File /usr/lib/python3.11/concurrent/futures/_base.py:456, in Future.result(self, timeout)
   454     raise CancelledError()
   455 elif self._state == FINISHED:
--> 456     return self.__get_result()
   457 else:
   458     raise TimeoutError()

File /usr/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
   399 if self._exception:
   400     try:
--> 401         raise self._exception
   402     finally:
   403         # Break a reference cycle with the exception in self._exception
   404         self = None

File /usr/lib/python3.11/concurrent/futures/thread.py:58, in _WorkItem.run(self)
   55     return
   57 try:
---> 58     result = self.fn(*self.args, **self.kwargs)
   59 except BaseException as exc:
   60     self.future.set_exception(exc)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/config.py:466, in ContextThreadPoolExecutor.map.<locals>._wrapped_fn(*args)
   465 def _wrapped_fn(*args: Any) -> T:
--> 466     return contexts.pop().run(fn, *args)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/runnables/base.py:627, in Runnable.batch.<locals>.invoke(input, config)
   625     return e
   626 else:
--> 627     return self.invoke(input, config, **kwargs)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py:158, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
   147 def invoke(
   148     self,
   149     input: LanguageModelInput,
   (...)
   153     **kwargs: Any,
   154 ) -> BaseMessage:
   155     config = ensure_config(config)
   156     return cast(
   157         ChatGeneration,
--> 158         self.generate_prompt(
   159             [self._convert_input(input)],
   160             stop=stop,
   161             callbacks=config.get("callbacks"),
   162             tags=config.get("tags"),
   163             metadata=config.get("metadata"),
   164             run_name=config.get("run_name"),
   165             run_id=config.pop("run_id", None),
   166             **kwargs,
   167         ).generations[0][0],
   168     ).message

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py:560, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
   552 def generate_prompt(
   553     self,
   554     prompts: List[PromptValue],
   (...)
   557     **kwargs: Any,
   558 ) -> LLMResult:
   559     prompt_messages = [p.to_messages() for p in prompts]
--> 560     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py:421, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
   419     if run_managers:
   420         run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 421     raise e
   422 flattened_outputs = [
   423     LLMResult(generations=[res.generations], llm_output=res.llm_output)  # type: ignore[list-item]
   424     for res in results
   425 ]
   426 llm_output = self._combine_llm_outputs([res.llm_output for res in results])

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py:411, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
   408 for i, m in enumerate(messages):
   409     try:
   410         results.append(
--> 411             self._generate_with_cache(
   412                 m,
   413                 stop=stop,
   414                 run_manager=run_managers[i] if run_managers else None,
   415                 **kwargs,
   416             )
   417         )
   418     except BaseException as e:
   419         if run_managers:

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py:632, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
   630 else:
   631     if inspect.signature(self._generate).parameters.get("run_manager"):
--> 632         result = self._generate(
   633             messages, stop=stop, run_manager=run_manager, **kwargs
   634         )
   635     else:
   636         result = self._generate(messages, stop=stop, **kwargs)

File ~/Workspace/lchain/venv311/lib/python3.11/site-packages/langchain_experimental/llms/ollama_functions.py:107, in OllamaFunctions._generate(self, messages, stop, run_manager, **kwargs)
   101     raise ValueError(
   102         f'"{self.llm.model}" did not respond with valid JSON. Please try again.'
   103     )
   105 print(parsed_chat_result.keys())  # CUSTOM added for DEBUG
--> 107 called_tool_name = parsed_chat_result["tool"]
   108 called_tool_arguments = parsed_chat_result["tool_input"]
   109 called_tool = next(
   110     (fn for fn in functions if fn["name"] == called_tool_name), None
   111 )

KeyError: 'tool'

Description

While following one of the DLAI tutorials, an issue occurred in OllamaFunctions._generate from the langchain_experimental package.

I used the article given in the tutorial and tried to parse it by following the tutorial steps (see the example code above).

The issue is that the parsed result in OllamaFunctions._generate sometimes does not contain dict_keys(['tool', 'tool_input']) but other keys such as dict_keys(['thoughts', 'command']), which ends in a KeyError.

The steps above worked in the tutorial (with ChatOpenAI), but I did not try the OpenAI chat model because I do not have an API key; I am using a local Ollama openhermes_2.5_7b_q5_k_m model.

What I have observed:

- len(doc.page_content) == 43902
- there is no issue with chain.invoke(doc.page_content[:30000])
- the issue starts with chain.invoke(doc.page_content[:40000])

In such cases I would expect either that the KeyError is handled, so the user still gets a final result along with some info, or that a more precise error is raised.
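
For illustration, a minimal sketch of such a check inside OllamaFunctions._generate (names mirror the snippet in the traceback above; this is not the actual fix that was merged):

if "tool" not in parsed_chat_result or "tool_input" not in parsed_chat_result:
    # Raise a descriptive error instead of a bare KeyError, so the user
    # can see what the model actually returned.
    raise ValueError(
        f'"{self.llm.model}" did not produce a "tool"/"tool_input" response; '
        f"got keys {list(parsed_chat_result.keys())} instead."
    )
called_tool_name = parsed_chat_result["tool"]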

System Info

System Information

OS: Linux
OS Version: #28~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 15 10:51:06 UTC 2
Python Version: 3.11.9 (main, Apr 6 2024, 17:59:24) [GCC 11.4.0]

Package Information

langchain_core: 0.1.43
langchain: 0.1.16
langchain_community: 0.0.33
langsmith: 0.1.48
langchain_experimental: 0.0.57
langchain_text_splitters: 0.0.1

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

Ollama server | openhermes_2.5_7b_q5_k_m | CUDA

baskargopinath commented 1 month ago

Did you find a fix @pretbc?

lalanikarim commented 1 month ago

Take a look at #22339, which should address this issue. The PR was approved and merged yesterday, but a release has yet to be cut from it; that should happen in the next few days.

In the meantime, you can try installing langchain-experimental directly from langchain's source like this:

pip install "git+https://github.com/langchain-ai/langchain.git#egg=langchain-experimental&subdirectory=libs/experimental"
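
To confirm which version ended up installed, the package metadata can be checked, e.g.:

python -c "from importlib.metadata import version; print(version('langchain-experimental'))"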

I hope this helps.