MeetKai / functionary

Chat language model that can use tools and interpret the results

Functionary model fails to correctly execute functions with autogen #148

Closed patrickwasp closed 7 months ago

patrickwasp commented 7 months ago

When using autogen I can't get functionary to call the defined function. The exact same example using gpt-3.5-turbo works as expected.

The model is loaded in LM Studio.

This is the request the LM Studio API receives:

Received POST request to /v1/chat/completions with body: {
  "messages": [
    {
      "content": "For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
      "role": "system"
    },
    {
      "content": "How much is 112.23 Euros in US Dollars?",
      "role": "user"
    }
  ],
  "model": "meetkai/functionary-small-v2.4-GGUF/functionary-small-v2.4.f16.gguf",
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "description": "Currency exchange calculator.",
        "name": "currency_calculator",
        "parameters": {
          "type": "object",
          "properties": {
            "base": {
              "properties": {
                "currency": {
                  "description": "Currency symbol",
                  "enum": [
                    "USD",
                    "EUR"
                  ],
                  "title": "Currency",
                  "type": "string"
                },
                "amount": {
                  "default": 0,
                  "description": "Amount of currency",
                  "minimum": 0,
                  "title": "Amount",
                  "type": "number"
                }
              },
              "required": [
                "currency"
              ],
              "title": "Currency",
              "type": "object",
              "description": "Base currency: amount and currency symbol"
            },
            "quote_currency": {
              "enum": [
                "USD",
                "EUR"
              ],
              "type": "string",
              "default": "USD",
              "description": "Quote currency symbol"
            }
          },
          "required": [
            "base"
          ]
        }
      }
    }
  ]
}

The output of the LLM begins with the following and never calls the provided function:

To perform this calculation, we need to know the exchange rate from Euros (EUR) to US Dollars (USD). The exchange rate can fluctuate and it may be different today than it was when you read this instruction. As of my last update in April 2023, the approximate conversion rate for

example.py

from typing import Literal

from pydantic import BaseModel, Field
from typing_extensions import Annotated

import autogen
from autogen.cache import Cache

function_caller_config_list = [
    {
        "model": "meetkai/functionary-small-v2.4-GGUF/functionary-small-v2.4.f16.gguf",
        "api_key": "lm-studio",
        "base_url": "http://localhost:1234/v1",
    }
]

function_calling_llm_config = {
    "config_list": function_caller_config_list,
    "timeout": 120,
}

CurrencySymbol = Literal["USD", "EUR"]

def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 1.1
    elif base_currency == "EUR" and quote_currency == "USD":
        return 1.1
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

class Currency(BaseModel):
    currency: Annotated[CurrencySymbol, Field(..., description="Currency symbol")]
    amount: Annotated[float, Field(0, description="Amount of currency", ge=0)]

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
    llm_config=function_calling_llm_config,
)

def currency_calculator(
    base: Annotated[Currency, "Base currency: amount and currency symbol"],
    quote_currency: Annotated[CurrencySymbol, "Quote currency symbol"] = "USD",
) -> Currency:
    quote_amount = exchange_rate(base.currency, quote_currency) * base.amount
    return Currency(amount=quote_amount, currency=quote_currency)

autogen.agentchat.register_function(
    currency_calculator,
    caller=chatbot,
    executor=user_proxy,
    description="Currency exchange calculator.",
)

with Cache.disk() as cache:
    # start the conversation
    res = user_proxy.initiate_chat(
        chatbot, message="How much is 112.23 Euros in US Dollars?", summary_method="last_msg", cache=cache
    )
jeffreymeetkai commented 7 months ago

Hi, I believe it is because LM-Studio's server is not fully compatible with Functionary in terms of function calling. Mainly, the tools schema is not generated correctly by LM-Studio's OpenAI server, and the raw output text is not parsed to convert it into tool calls.

Our model requires the tools to be passed in following a certain TypeScript schema, as mentioned here. However, LM-Studio's server does not have this Functionary-specific schema generation, so the model is not given the correct tools schema.
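
For context, this is roughly what the generated schema looks like for a simple tool. It is only a sketch: the exact text comes from generate_schema_from_functions, so treat the rendering in the comments as approximate.

# Sketch: the TypeScript-style schema Functionary expects. The schema text goes
# into a system message instead of the raw JSON "tools" list.
from schema import generate_schema_from_functions  # i.e. functionary/schema.py

weather_function = {
    "name": "get_current_weather",
    "description": "Get the current weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
            }
        },
        "required": ["location"],
    },
}

print(generate_schema_from_functions([weather_function]))
# Prints roughly:
# // Supported function definitions that should be called when necessary.
# namespace functions {
#
# // Get the current weather
# type get_current_weather = (_: {
# // The city and state, e.g. San Francisco, CA
# location: string,
# }) => any;
#
# } // namespace functions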

A way to enable tool use would be to generate the schema on your end and pass it to LM-Studio's server in a system message. The generate_schema_from_functions function in functionary/schema.py is responsible for generating the tools schema. The code below works for me:

# Chat with an intelligent assistant in your terminal
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

from schema import generate_schema_from_functions

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

class Function(BaseModel):
    name: str
    description: Optional[str] = Field(default="")
    parameters: Optional[dict] = None

tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
}

history = [
    {"role": "system", "content": generate_schema_from_functions([tool["function"]])},
    {"role": "system", "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary"},
    {"role": "user", "content": "What is the weather in Hanoi?"},
]

while True:
    completion = client.chat.completions.create(
        model="meetkai/functionary-small-v2.4-GGUF",
        messages=history,
        temperature=0.1,
        tools=[tool],
        stream=True,
        stop="<|stop|>"
    )

    new_message = {"role": "assistant", "content": ""}

    for chunk in completion:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
            new_message["content"] += chunk.choices[0].delta.content

    history.append(new_message)

    # Uncomment to see chat history
    # import json
    # gray_color = "\033[90m"
    # reset_color = "\033[0m"
    # print(f"{gray_color}\n{'-'*20} History dump {'-'*20}\n")
    # print(json.dumps(history, indent=2))
    # print(f"\n{'-'*55}\n{reset_color}")

    print()
    history.append({"role": "user", "content": input("> ")})

The output will be:


<|from|> assistant
<|recipient|> get_current_weather
<|content|> {"location": "Hanoi"}

Note that since LM-Studio does not use our custom handling of the raw output text, there is no parsing of the output into tool calls. You will therefore need to write your own parser for the raw text, following the prompt template in the link above.
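
For example, a very small parser for the raw format shown above could look like the sketch below. This is only a starting point under the assumption that the output follows the <|from|> / <|recipient|> / <|content|> layout above; the full prompt template also covers plain-text replies (recipient "all") and multiple calls per turn.

import json

def parse_raw_functionary_output(raw: str):
    """Sketch: split raw v2-style output into assistant text and tool calls."""
    text_parts, tool_calls = [], []
    # Each assistant turn starts with "<|from|> assistant".
    for turn in raw.split("<|from|> assistant"):
        if "<|recipient|>" not in turn or "<|content|>" not in turn:
            continue
        recipient = turn.split("<|recipient|>", 1)[1].split("<|content|>", 1)[0].strip()
        content = turn.split("<|content|>", 1)[1].replace("<|stop|>", "").strip()
        if recipient == "all":
            # Plain text addressed to the user.
            text_parts.append(content)
        else:
            # A function call: the recipient is the function name, the content its arguments.
            tool_calls.append({"name": recipient, "arguments": json.loads(content)})
    return "\n".join(text_parts), tool_calls

raw = '<|from|> assistant\n<|recipient|> get_current_weather\n<|content|> {"location": "Hanoi"}'
print(parse_raw_functionary_output(raw))
# -> ('', [{'name': 'get_current_weather', 'arguments': {'location': 'Hanoi'}}])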

Alternatively, you can try our vLLM server or the llama-cpp-python integration. Our vLLM OpenAI-compatible server works with either the standard HF weights or the new 4-bit AWQ weights. Our OpenAI-compatible server in llama-cpp-python is detailed here.
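
For example, the config_list from the original report would only need its model name and base_url changed to point at one of those servers. The port and model name below are assumptions and should match however the server was started:

# Sketch: pointing the autogen config at Functionary's own OpenAI-compatible server
# (vLLM or llama-cpp-python) instead of LM Studio. Values below are illustrative.
function_caller_config_list = [
    {
        "model": "meetkai/functionary-small-v2.4",   # the model the server was launched with
        "api_key": "functionary",                    # any non-empty string for a local server
        "base_url": "http://localhost:8000/v1",      # assumed default port of the local server
    }
]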

Hope this helps!