Open woodswift opened 4 months ago
Can you confirm if Mistral-7B-OpenOrca
supports function call format in its prompt template? To enable function all, the model itself needs to support/fine-tuned with function call format as well.
I can not repro, the above code works for me. I would suggest checking that the correct OAI_CONFIG_FILE is being picked up, because the message ValueError: Model client "CustomModelClient" is being registered but was not found in the config_list. Please make sure to include an entry in the config_list with "model_client_cls": "CustomModelClient"
implies that
Can you confirm if
Mistral-7B-OpenOrca
supports function call format in its prompt template? To enable function all, the model itself needs to support/fine-tuned with function call format as well.
Yeah. That's really a good call-out! I missed that.
I finally got myself unblocked by using the class CustomModelClientWithArguments
. However, tool calling was not successful, and my best guess was that Mistral-7B-OpenOrca
was not finetuned for tool calling task.
Here is my solution:
import math
from types import SimpleNamespace, Optional, Type
import autogen
from autogen import AssistantAgent, UserProxyAgent
from transformers import AutoTokenizer, GenerationConfig, AutoModelForCausalLM
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool
class CustomModelClient:
def __init__(self, config, **kwargs):
self.device = config.get("device", "cpu")
self.model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = config["model"])
self.model_name = config["model"]
self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path = config["model"], use_fast = False)
self.tokenizer.pad_token_id = self.tokenizer.eos_token_id
# params are set by the user and consumed by the user since they are providing a custom model
# so anything can be done here
gen_config_params = config.get("params", {})
self.max_length = gen_config_params.get("max_length", 256)
def create(self, params):
if params.get("stream", False) and "messages" in params:
raise NotImplementedError("Local models do not support streaming.")
else:
num_of_responses = params.get("n", 1)
# can create my own data response class
# here using SimpleNamespace for simplicity
# as long as it adheres to the ClientResponseProtocol
response = SimpleNamespace()
inputs = self.tokenizer.apply_chat_template(
conversation = params["messages"],
return_tensors="pt",
add_generation_prompt=True
).to(self.device)
inputs_length = inputs.shape[-1]
# add inputs_length to max_length
max_length = self.max_length + inputs_length
generation_config = GenerationConfig(
max_length = max_length,
eos_token_id = self.tokenizer.eos_token_id,
pad_token_id = self.tokenizer.pad_token_id,
)
response.choices = []
response.model = self.model_name
for _ in range(num_of_responses):
outputs = self.model.generate(
inputs = inputs,
generation_config=generation_config
)
# Decode only the newly generated text, excluding the prompt
text = self.tokenizer.decode(token_ids = outputs[0, inputs_length:])
choice = SimpleNamespace()
choice.message = SimpleNamespace()
choice.message.content = text
choice.message.function_call = None
response.choices.append(choice)
return response
def message_retrieval(self, response):
"""Retrieve the messages from the response."""
choices = response.choices
return [choice.message.content for choice in choices]
def cost(self, response) -> float:
"""Calculate the cost of the response."""
response.cost = 0
return 0
@staticmethod
def get_usage(response):
# returns a dict of prompt_tokens, completion_tokens, total_tokens, cost, model
# if usage needs to be tracked, else None
return {}
class CustomModelClientWithArguments(CustomModelClient):
def __init__(self, config, loaded_model, tokenizer, **kwargs):
logger.info(f"CustomModelClientWithArguments config: {config}")
self.device = config.get("device", "cpu")
self.model = loaded_model
self.model_name = config["model"]
self.tokenizer = tokenizer
self.tokenizer.pad_token_id = tokenizer.eos_token_id
gen_config_params = config.get("params", {})
self.max_length = gen_config_params.get("max_length", 256)
class CustomToolInput(BaseModel):
income: float = Field()
class CustomTool(BaseTool):
name = "tax_calculator"
description = "Use this tool when you need to calculate the tax using the income"
args_schema: Type[BaseModel] = CustomToolInput
def _run(self, income: float):
return float(income) * math.pi / 100
# Define a function to generate llm_config from a LangChain tool
def generate_llm_config(tool):
# Define the function schema based on the tool's args_schema
function_schema = {
"name": tool.name.lower().replace(" ", "_"),
"description": tool.description,
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
}
if tool.args is not None:
function_schema["parameters"]["properties"] = tool.args
return function_schema
custom_tool = CustomTool()
config_list_custom = autogen.config_list_from_json(
env_or_file = "OAI_CONFIG_LIST.json",
file_location = "/dev/",
filter_dict = {"model_client_cls": ["CustomModelClientWithArguments"]},
)
config = config_list_custom[0]
loaded_model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path = config["model"])
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path = config["model"], use_fast = False)
user_proxy = UserProxyAgent(
name = "user_proxy",
is_termination_msg = lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
human_input_mode = "NEVER",
max_consecutive_auto_reply = 2,
code_execution_config = {
"work_dir": "coding",
"use_docker": False, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
"timeout": 600,
"last_n_messages": 1
},
)
user_proxy.register_function(
function_map={
custom_tool.name: custom_tool._run
}
)
llm_config = {
"functions": [generate_llm_config(custom_tool)],
"config_list": config_list_custom,
"timeout": 120
}
assistant = AssistantAgent(
name = "assistant",
llm_config = llm_config,
system_message = "For coding tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done."
)
assistant.register_model_client(
model_client_cls = CustomModelClientWithArguments,
loaded_model = loaded_model,
tokenizer = tokenizer
)
with autogen.Cache.disk():
user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
I posted this issue to our Discord channel to see if there are some there that can help. https://discord.com/channels/1153072414184452236/1201369716057440287
I posted this issue to our Discord channel to see if there are some there that can help. https://discord.com/channels/1153072414184452236/1201369716057440287
Thank you! But I am seeing nothing after clicking the link. I am new to Discord, do I miss anything?
you might need a fine-tuned model. Trelis on huggingface has a couple,, but there is a dataset that you can use if you want to train your own. And for the discord autogen server I think you have to look for the channel #alt-models
@woodswift recently mistral model have started to support tool call. Have you checked? https://docs.mistral.ai/api/#operation/createChatCompletion
@woodswift recently mistral model have started to support tool call. Have you checked? https://docs.mistral.ai/api/#operation/createChatCompletion
Oh, thank you for sharing! I have not tried it yet, so will do shortly :)
@woodswift, you should be able to do it through LiteLLM+Ollama (note: Ollama released a new version, 0.1.29, you'll need that). You can also test through together.ai who have Mistral and Mixtral models that support function calling).
Oh, if you are using LiteLLM + Ollama, please be sure to use "ollama_chat/
@woodswift, you should be able to do it through LiteLLM+Ollama (note: Ollama released a new version, 0.1.29, you'll need that). You can also test through together.ai who have Mistral and Mixtral models that support function calling).
Oh, if you are using LiteLLM + Ollama, please be sure to use "ollama_chat/
" rather than "ollama/ ".
Is that work?
@woodswift can you update us?
@woodswift, you should be able to do it through LiteLLM+Ollama (note: Ollama released a new version, 0.1.29, you'll need that). You can also test through together.ai who have Mistral and Mixtral models that support function calling).
Oh, if you are using LiteLLM + Ollama, please be sure to use "ollama_chat/" rather than "ollama/".
hi, I use ollama with Mistral, but still cann't use function calling :( and I don't understand what dose "if you are using LiteLLM + Ollama, please be sure to use "ollama_chat/" rather than "ollama/" " means, can you explain that more?
Hi @JarkimZhu, no problem, when you run your LiteLLM server you need to use "ollama_chat" instead of "ollama", here's an example:
litellm --model ollama_chat/llama2
Thank you!
Can you tell us how you are running the model? LiteLLM + Ollama, together.ai, etc.
If it's LiteLLM can you please share the command line.
And any sample code you are using would help.
Thanks!
Describe the issue
I am trying to combine the following two notebooks into one:
In my simple and naive script to test the concept, I created two agent instances
assistant
asAssistantAgent()
anduser_proxy
asUserProxyAgent()
. The challenge is how to initialize and registerassistant
when I need to inform both ofassistant
anduser_proxy
of the provided tools as functions. However, it will raise error in running eitherassistant.register_model_client()
oruser_proxy.initiate_chat()
. I don't know if the problem is my script or if there is a bug. I am grateful if you can help me on this.Steps to reproduce
Step 1: Confirm the custom model's local path
Confirm the model
Mistral-7B-OpenOrca
exists in the file path/dev/Open-Orca/Mistral-7B-OpenOrca
locallyStep 2: Create JSON file named as
OAI_CONFIG_LIST.json
under the directory/dev/
with the following contentStep 3: Run following Python script
Screenshots and logs
I am getting error after running
assistant.register_model_client(model_client_cls = CustomModelClient)
:and I am getting error after running
user_proxy.initiate_chat(assistant, message="when the income is 100, calculate the tax")
:Additional Information
If I do what is shown below instead, the script will run and complete. However, the custom model is obviously NOT aware of the tool provided based on the output.