Hi, I believe this is because LM-Studio's server is not fully compatible with Functionary's function-calling. Mainly, LM-Studio's OpenAI server neither generates the tools schema correctly nor parses the raw output text to convert it into tool calls.
Our model requires the tools to be passed in following a certain TypeScript schema, as mentioned here. However, LM-Studio's server does not have this schema generation specific to Functionary, so the model is never given the correct tools schema.
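For reference, here is a sketch of what that schema looks like for a single weather function (the output shown in the comments is illustrative; the exact text is produced by the prompt template linked above):

from schema import generate_schema_from_functions  # functionary/schema.py

weather_function = {
    "name": "get_current_weather",
    "description": "Get the current weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
        },
        "required": ["location"],
    },
}

# Prints a TypeScript-style namespace roughly along these lines:
#   namespace functions {
#   // Get the current weather
#   type get_current_weather = (_: {
#   // The city and state, e.g. San Francisco, CA
#   location: string,
#   }) => any;
#   } // namespace functions
print(generate_schema_from_functions([weather_function]))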
A way to enable tool use would be to generate the schema on your end and pass it in a system message to LM-Studio's server. The generate_schema_from_functions function in functionary/schema.py is responsible for generating the tools schema. The code below works for me:
# Chat with an intelligent assistant in your terminal
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional

from schema import generate_schema_from_functions

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")


class Function(BaseModel):
    name: str
    description: Optional[str] = Field(default="")
    parameters: Optional[dict] = None


tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
}

history = [
    {"role": "system", "content": generate_schema_from_functions([tool["function"]])},
    {"role": "system", "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary"},
    {"role": "user", "content": "What is the weather in Hanoi?"},
]

while True:
    completion = client.chat.completions.create(
        model="meetkai/functionary-small-v2.4-GGUF",
        messages=history,
        temperature=0.1,
        tools=[tool],
        stream=True,
        stop="<|stop|>"
    )

    new_message = {"role": "assistant", "content": ""}
    for chunk in completion:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
            new_message["content"] += chunk.choices[0].delta.content
    history.append(new_message)

    # Uncomment to see chat history
    # import json
    # gray_color = "\033[90m"
    # reset_color = "\033[0m"
    # print(f"{gray_color}\n{'-'*20} History dump {'-'*20}\n")
    # print(json.dumps(history, indent=2))
    # print(f"\n{'-'*55}\n{reset_color}")

    print()
    history.append({"role": "user", "content": input("> ")})
The output will be:
<|from|> assistant
<|recipient|> get_current_weather
<|content|> {"location": "Hanoi"}
Note that since LM-Studio does not use our custom handling of the raw output text, the output is never parsed into tool calls. Thus, you'll need to write your own parser for the raw text, following the prompt template in the link above.
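As a rough starting point, such a parser could look like the sketch below. This is an illustrative helper of my own, not part of Functionary, and it only handles the simple <|from|>/<|recipient|>/<|content|> pattern shown above:

import json
import re

V2_TURN_PATTERN = re.compile(
    r"<\|from\|>\s*assistant\s*"
    r"<\|recipient\|>\s*(?P<recipient>\S+)\s*"
    r"<\|content\|>\s*(?P<content>.*?)(?=<\|from\|>|<\|stop\|>|$)",
    re.DOTALL,
)

def parse_functionary_v2_output(text):
    """Split raw Functionary v2 output into user-facing text and tool calls (sketch only)."""
    content_parts, tool_calls = [], []
    for match in V2_TURN_PATTERN.finditer(text):
        recipient = match.group("recipient")
        content = match.group("content").strip()
        if recipient == "all":
            # Recipient "all" is normal assistant text addressed to the user
            content_parts.append(content)
        else:
            # Otherwise the recipient is a function name and the content is its JSON arguments
            tool_calls.append({"name": recipient, "arguments": json.loads(content)})
    return "\n".join(content_parts), tool_calls

raw = '<|from|> assistant\n<|recipient|> get_current_weather\n<|content|> {"location": "Hanoi"}'
print(parse_functionary_v2_output(raw))
# -> ('', [{'name': 'get_current_weather', 'arguments': {'location': 'Hanoi'}}])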
Alternatively, you can try with our vLLM server or the llama-cpp-python integration. Our vLLM OpenAI-compatible server works with either the standard HF weights or the new 4-bit AWQ weights. Our OpenAI-compatible server in llama-cpp-python is detailed here.
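With one of those OpenAI-compatible servers, you pass the tools directly and get structured tool calls back, with no manual schema injection or output parsing. A minimal sketch, assuming the server is listening on http://localhost:8000/v1 and serving meetkai/functionary-small-v2.4:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="functionary")  # assumed local endpoint

tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
            },
            "required": ["location"],
        },
    },
}

response = client.chat.completions.create(
    model="meetkai/functionary-small-v2.4",  # assumed model name used when launching the server
    messages=[{"role": "user", "content": "What is the weather in Hanoi?"}],
    tools=[tool],
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)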
Hope this helps!
When using AutoGen, I can't get Functionary to call the defined function. The exact same example using gpt-3.5-turbo works as expected.
The model is loaded in LM Studio.
This is the request the LM Studio API receives:
The output of the LLM begins with the following and never calls the provided function:
example.py