langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Tools do not work with HuggingFace - Issue either with tutorial or library #22379

Closed: saptarshi091 closed this issue 1 week ago

saptarshi091 commented 5 months ago

Checked other resources

Example Code

I followed the exact steps in https://python.langchain.com/v0.2/docs/integrations/chat/huggingface/, but it does not work: as soon as I bind tools to my model and invoke the chain, the code below throws an error.

from langchain_core.output_parsers.openai_tools import PydanticToolsParser
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace

class Calculator(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")

llm = HuggingFaceEndpoint(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    max_new_tokens=100,
    do_sample=False,
    seed=42,
)

chat_model = ChatHuggingFace(llm=llm)
llm_with_multiply = chat_model.bind_tools([Calculator], tool_choice="auto")
parser = PydanticToolsParser(tools=[Calculator])
tool_chain = llm_with_multiply | parser
tool_chain.invoke("How much is 3 multiplied by 12?")

Error Message and Stack Trace (if applicable)

warnings.warn(
Traceback (most recent call last):
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/llm_application_with_calc.py", line 69, in <module>
    tool_chain.invoke("How much is 3 multiplied by 12?")
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2399, in invoke
    input = step.invoke(
            ^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 4433, in invoke
    return self.bound.invoke(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 170, in invoke
    self.generate_prompt(
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 599, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 456, in generate
    raise e
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 446, in generate
    self._generate_with_cache(
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 671, in _generate_with_cache
    result = self._generate(
             ^^^^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_huggingface/chat_models/huggingface.py", line 212, in _generate
    return self._create_chat_result(answer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_huggingface/chat_models/huggingface.py", line 189, in _create_chat_result
    message=_convert_TGI_message_to_LC_message(response.choices[0].message),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/langchain_huggingface/chat_models/huggingface.py", line 102, in _convert_TGI_message_to_LC_message
    if "arguments" in tool_calls[0]["function"]:
                      ~~~~~~~~~~^^^
  File "/Users/ssengupta/Desktop/LangchainTests/Langchain_Trials/.venv/lib/python3.12/site-packages/huggingface_hub/inference/_generated/types/base.py", line 144, in __getitem__
    return super().__getitem__(__key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 0
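
For what it's worth, the tail of the trace hints at the failure mode: `tool_calls[0]` is resolved through huggingface_hub's dict-backed `__getitem__`, which treats `0` as a missing dictionary key. A minimal sketch of that behaviour (an inference from the trace, not a confirmed diagnosis):

# Sketch: if `tool_calls` arrives as a dict-backed object instead of a list,
# `tool_calls[0]` becomes a lookup for the key 0 and raises KeyError.
tool_calls = {"function": {"name": "Calculator"}}  # dict, not a list of calls

try:
    tool_calls[0]["function"]  # same access pattern as huggingface.py line 102
except KeyError as err:
    print(f"KeyError: {err}")  # -> KeyError: 0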

Description

I'm following the tutorial exactly, but I still get the error above. I even downgraded to 0.2.2, and it still doesn't work.

System Info

System Information
------------------
> OS:  Darwin
> OS Version:  Darwin Kernel Version 23.5.0: Wed May  1 20:09:52 PDT 2024; root:xnu-10063.121.3~5/RELEASE_X86_64
> Python Version:  3.12.3 (v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:48) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information
-------------------
> langchain_core: 0.2.3
> langchain: 0.2.1
> langchain_community: 0.2.1
> langsmith: 0.1.67
> langchain_huggingface: 0.0.1
> langchain_text_splitters: 0.2.0
> langchainhub: 0.1.17
> langgraph: 0.0.60

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langserve
frankoz07 commented 4 months ago

from langchain_core.output_parsers.openai_tools import PydanticToolsParser
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace

class Calculator(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")

llm = HuggingFaceEndpoint(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    max_new_tokens=100,
    do_sample=False,
    seed=42,
)

chat_model = ChatHuggingFace(llm=llm)
llm_with_multiply = chat_model.bind_tools([Calculator], tool_choice="auto")
parser = PydanticToolsParser(tools=[Calculator])
tool_chain = llm_with_multiply | parser
tool_chain.invoke("How much is 3 multiplied by 12?")

Same code here: I didn't get any error, but there was simply no output. Output: []
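
In case it helps with triage: a quick way to tell whether the model never emitted a tool call or the parser swallowed it is to invoke the bound model directly, before the parser (a sketch reusing `llm_with_multiply` from the code above):

# Inspect the raw AIMessage instead of the parsed chain output.
msg = llm_with_multiply.invoke("How much is 3 multiplied by 12?")
print(repr(msg.content))  # the raw text the model generated
print(msg.tool_calls)     # [] here means the model never produced a tool call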

MathisBeom commented 4 months ago

I'm facing the same issue. Has anyone found a solution, please?

NedDavies commented 4 months ago

@baskaryan and @michalgregor did amazing work getting HuggingFacePipeline and ChatHuggingFace compatible with tool use and writing some docs, and it seems 90% of the way there, but I am also struggling to get the model to actually use the tools.

I have tried to recreate the example using Zephyr and other models like Mistral-Instruct, but I am also getting an empty string when using a custom function. The same code works fine if I use OpenAI.

If I use an inbuilt tool like Tavily, the model just appears not to 'see' the tools:

from langchain_community.tools.tavily_search import TavilySearchResults
import os

os.environ["TAVILY_API_KEY"] = '...'
search = TavilySearchResults(max_results=2)
tools = [search]

model_with_tools = chat_model.bind_tools(tools)

response = model_with_tools.invoke([HumanMessage(content="What's the weather in SF?")])

The response I get is:

ContentString: [INST] What's the weather in SF? [/INST] I'd be happy to help you with that! However, I'll need to check a reliable weather source to provide you with an accurate answer. According to Weather.com, the current weather in San Francisco is mostly cloudy with a temperature of 58°F (14.4°C). Please keep in mind that weather conditions can change frequently, so it's always a good idea to check multiple sources or the latest forecast for the most up-to-date information.
ToolCalls: []

The tool seems to have 'bound' to the model just fine when I inspect the model:

RunnableBinding(bound=ChatHuggingFace(llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x18f8ffc10>, model_id='mistralai/Mistral-7B-Instruct-v0.2', model_kwargs={}, pipeline_kwargs={'max_new_tokens': 512, 'do_sample': False, 'repetition_penalty': 1.03}), tokenizer=LlamaTokenizerFast(name_or_path='mistralai/Mistral-7B-Instruct-v0.2', vocab_size=32000, model_max_length=1000000000000000019884624838656, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'bos_token': '', 'eos_token': '', 'unk_token': ''}, clean_up_tokenization_spaces=False), added_tokens_decoder={ 0: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 1: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 2: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), }, model_id='mistralai/Mistral-7B-Instruct-v0.2'), kwargs={'tools': [{'type': 'function', 'function': {'name': 'tavily_search_results_json', 'description': 'A search engine optimized for comprehensive, accurate, and trusted results. Useful for when you need to answer questions about current events. Input should be a search query.', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'search query to look up', 'type': 'string'}}, 'required': ['query']}}}]})

Any thoughts on this would be appreciated.
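
One more check that might help: render the prompt yourself and look for the tool schema. This is a sketch; `_to_chat_prompt` is an internal ChatHuggingFace helper in the 0.2-era source, so treat the name and its stability as assumptions.

from langchain_core.messages import HumanMessage

# Render the chat template the model will actually receive.
rendered = chat_model._to_chat_prompt([HumanMessage(content="What's the weather in SF?")])
# If no tool/function description appears in this string, the pipeline path
# never injects the bound schema, and the model literally cannot "see" it.
print(rendered)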

NedDavies commented 4 months ago

I replicated the error in a Colab notebook (I had to use Gemma 2B, as everything else crashed in Colab) :'(

https://colab.research.google.com/drive/1rYF_f4Tx6Dx2zRrEWWvWkRqXgW_LSnsM?usp=sharing

Rajjaa commented 3 months ago

I also couldn't replicate the tutorial.

According to the execution trace in LangSmith, the tool is called and the chat model receives a ToolMessage. However, the chat model ignores the ToolMessage and generates another tool call.

michalgregor commented 3 months ago

Hi everyone, I can confirm that tools are not supported yet – it is just the chat that works. I guess we should add a warning / error to that effect while we work on implementing it. 🤔
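
Something as blunt as this would do (a hypothetical sketch of the guard being suggested, not the library's actual code):

# Hypothetical guard: fail loudly instead of silently returning empty tool calls.
class ChatHuggingFaceWithGuard:
    def bind_tools(self, tools, **kwargs):
        raise NotImplementedError(
            "Tool calling is not yet supported by ChatHuggingFace; "
            "only plain chat works for now."
        )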

saptarshi091 commented 3 months ago

@michalgregor So are you saying tools are not supported yet for open-source models only, or also for closed-source models like GPT-4?

michalgregor commented 3 months ago

As far as I can see, closed-source models are not relevant at all in this context, since we are talking about the HuggingFace package, yes?

The issue with HuggingFace is that they don't really support tools in the transformers library (some support is gradually starting to appear with particular models and tokenizers). The reason it is supported with API endpoints, AFAIK, is that tools are handled in some more generic way in the text-generation-inference package (or indeed whatever else HuggingFace is using to provide the APIs internally).

What I think we should do is ascertain how exactly text-generation-inference handles this (the code is not super easy to read, so unless we can get information from the authors, the most straightforward way might be to just compile a version that dumps the generated LLM inputs to stdout or a file so that we can inspect them) and replicate it. The issue, of course, is that every model structures its input prompts differently (e.g. some have special markup for system prompts or for messages, and some even have explicit support for function calling). I can think of different ways of inserting descriptions of available tools into prompts, but it would be nice to adhere to whatever HuggingFace is already doing for their API.
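
In the meantime, one illustrative workaround along those lines (a sketch, not HuggingFace's actual mechanism, and model behaviour will vary) is to describe the tool in a system prompt and parse the JSON reply yourself:

import json
from langchain_core.messages import HumanMessage, SystemMessage

# Hand-rolled tool description injected into the prompt (illustrative only).
tool_schema = {
    "name": "Calculator",
    "description": "Multiply two integers together.",
    "parameters": {"a": "first integer", "b": "second integer"},
}
system = SystemMessage(
    content=(
        "You may call this tool by replying only with JSON of the form "
        '{"name": ..., "arguments": {...}}. Available tool:\n'
        + json.dumps(tool_schema)
    )
)
reply = chat_model.invoke([system, HumanMessage(content="What is 3 times 12?")])
print(reply.content)  # parse any JSON tool call out of this manually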

Alternatively, we could wait for HuggingFace to improve its support for tools and then try to build an interface over that – but that seems to me like the less ideal option tbh.

saptarshi091 commented 3 months ago

Ah yes, my mistake, closed-source models are not relevant to this context since we're dealing with HF open-source models.

But yes, I understand what you're saying. I guess the way forward is similar to what you've indicated, i.e. to see whether the open-source LLM understands the role of the tools we provide, and then whether it can produce the correct tool parameters for us to call them.

In the meantime, adding a warning or recommendation to the tutorial would let people know that it may not work.