Open bitsofinfo opened 7 months ago
Exposing the functionary model via LM Studio (via its OpenAI API server) doesn't seem to work; I just get back plain text responses rather than function calls.
In any case, I next tried exposing the model via the llama.cpp server per the doc:
python3 -m llama_cpp.server \
--model path/to/functionary/functionary-small-v2.4.Q8_0.gguf \
--chat_format functionary-v2 \
--hf_pretrained_model_name_or_path path/to/functionary
then running a modified chatlab example:
import openai
import os
import asyncio
import chatlab
from pydantic import BaseModel
from typing import List, Optional
async def main():
    openai.api_key = "functionary"  # We just need to set this to something other than None
    os.environ['OPENAI_API_KEY'] = "functionary"  # chatlab requires us to set this too
    os.environ['OPENAI_API_BASE'] = "http://localhost:1234/v1"  # chatlab requires us to set this too
    openai.base_url = "http://localhost:1234/v1"

    # now provide the function with description
    def get_car_price(car_name: Optional[str] = None):
        """this function is used to get the price of the car given the name
        :param car_name: name of the car to get the price
        """
        car_price = {
            "tang": {"price": "$20000"},
            "song": {"price": "$25000"}
        }
        for key in car_price:
            if key in car_name.lower():
                return {"price": car_price[key]}
        return {"price": "unknown"}

    class CarPrice(BaseModel):
        car_name: Optional[str]

    chat = chatlab.Chat(model="meetkai/functionary-small-v2.4", base_url="http://localhost:8000/v1")

    # Register our function
    f = chat.register(get_car_price, CarPrice)
    print(f)

    await chat.submit("What is the price of the car named tang?")  # submit user prompt
    print(chat.messages)

if __name__ == "__main__":
    asyncio.run(main())
I get in the llama.cpp server stdout:
File "/path/to/rnd/ai/functionary/functionary.ve/lib/python3.12/site-packages/llama_cpp/llama.py", line 1655, in create_chat_completion
return handler(
^^^^^^^^
File "/rnd/ai/functionary/functionary.ve/lib/python3.12/site-packages/llama_cpp/llama_chat_format.py", line 1880, in functionary_v1_v2_chat_handler
assert stream is False # TODO: support stream mode
^^^^^^^^^^^^^^^^^^^^^^
AssertionError
INFO: ::1:59817 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
which relates to the note in the doc that llama-cpp-python's OpenAI-compatible server does not yet support streaming for Functionary models (as of v0.2.50).
not sure where to go from here.
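For reference, a direct (non-chatlab) request against the server would look roughly like this - a sketch only, assuming the server is listening on its default port 8000, the openai>=1.0 client, and an illustrative tool schema that mirrors get_car_price:

from openai import OpenAI

# Point the client at the local llama-cpp-python server (assumed default port 8000).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="functionary")

# Illustrative tool schema mirroring get_car_price above.
tools = [{
    "type": "function",
    "function": {
        "name": "get_car_price",
        "description": "Get the price of a car given its name",
        "parameters": {
            "type": "object",
            "properties": {"car_name": {"type": "string"}},
            "required": ["car_name"],
        },
    },
}]

# stream is left at its default (off) here.
response = client.chat.completions.create(
    model="meetkai/functionary-small-v2.4",
    messages=[{"role": "user", "content": "What is the price of the car named tang?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)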
Hi, thank you for your interest in our model.
The submit method in chatlab calls the llama-cpp-python server with streaming turned on by default. Unfortunately, streaming is not yet supported in the llama-cpp-python integration, which explains the error that you encountered. You can call the submit method with stream=False and it will work. FYI, I'm using chatlab==1.3.0; not sure if it's easier with the latest versions of chatlab. Will check on this soon. Hopefully your use case doesn't require streaming!

Thanks for the response. No, it doesn't require streaming; I'll try the stream=False option.
Doing that, I get back:
await chat.submit("What is the price of the car named tang?", stream=False) # submit user prompt
print(chat.messages)
{
  "error": {
    "message": "pydantic validation errors for ('body', 'messages', 2, 'typed-dict'); the offending input is the assistant message {'role': 'assistant', 'tool_calls': [{'id': 'call_fKVkfhXbvnlVtL6nBehEJzpJ', 'function': {'name': 'get_car_price', 'arguments': '{\"car_name\": \"tang\"}'}, 'type': 'function'}]}, which fails every accepted message shape:
      literal_error: role         - Input should be 'system'   (https://errors.pydantic.dev/2.6/v/literal_error)
      missing:       content      - Field required             (https://errors.pydantic.dev/2.6/v/missing)
      literal_error: role         - Input should be 'user'
      missing:       content      - Field required
      missing:       content      - Field required
      literal_error: role         - Input should be 'tool'
      missing:       content      - Field required
      missing:       tool_call_id - Field required
      literal_error: role         - Input should be 'function'
      missing:       content      - Field required
      missing:       name         - Field required",
    "type": "internal_server_error",
    "param": None,
    "code": None
  }
}
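If I'm reading those validation errors right, the server's request schema is rejecting the assistant message because it carries tool_calls but no content field. One blunt client-side workaround might be patching the history before it is resubmitted - a sketch only, assuming chat.messages holds plain dicts and that the server accepts an explicit null or empty content:

# Sketch of a possible client-side workaround (untested assumption): give every
# assistant message that only has tool_calls an explicit content field before
# the history is sent back to the server.
def patch_history(messages):
    patched = []
    for m in messages:
        m = dict(m)
        if m.get("role") == "assistant" and "content" not in m:
            m["content"] = None  # or "" if the server rejects null
        patched.append(m)
    return patched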
Can you try with the latest version of llama-cpp-python? I think their developers made some changes at some point that introduced this pydantic error. I'm on the latest official version, v0.2.61.
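To double-check which version is actually installed in your virtualenv, something like this works:

from importlib.metadata import version

# The PyPI distribution name is "llama-cpp-python"
print(version("llama-cpp-python"))  # should print 0.2.61 (or newer)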
Hi - overall I'm getting my feet wet in the LLM world; I came across this project via numerous references and am interested in trying it out. I've reviewed the docs and just need some guidance (they appear to be the same as what is in the README). There are lots of moving parts and different projects being referenced, so it's a bit overwhelming.
I am on a Mac M3, so it looks like the vLLM example is a no-go for me.
I successfully ran the llama.cpp inference example locally and it produces the function json as expected.
I'd like to try the chatlab example but it won't run. Per the note in the README regarding this, I can't install chatlab==0.16.0, and the latest chatlab version yields an import error on Conversation.

So here is what I'm trying to achieve that is not clear to me from the docs:
What should I run to get the functionary model exposed over the OpenAI API interface? I assume something like the llama.cpp example that provides inference over the functionary model, but long-lived, running in the background as a process? Perhaps I could achieve the same by running the functionary model in something like LM Studio's server?

Then, in a second process, is that where I'd run the code that acts as the client mediating between a user and the llama.cpp endpoint (for example chatlab, if I can get it running)?
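In other words, something along these lines is what I picture the second (client) process looking like - just a sketch, assuming the llama-cpp-python server from the doc is the long-lived process on localhost:8000 and reusing the get_car_price example from above:

import json
from openai import OpenAI

# Process 2: a thin client mediating between the user and the long-lived
# llama-cpp-python server started separately (assumed to be on port 8000).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="functionary")
MODEL = "meetkai/functionary-small-v2.4"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_car_price",
        "description": "Get the price of a car given its name",
        "parameters": {
            "type": "object",
            "properties": {"car_name": {"type": "string"}},
            "required": ["car_name"],
        },
    },
}]

def ask(prompt, functions):
    messages = [{"role": "user", "content": prompt}]
    reply = client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS).choices[0].message
    if not reply.tool_calls:
        return reply.content
    # Run the requested tool locally, then send the result back for a final answer.
    call = reply.tool_calls[0]
    result = functions[call.function.name](**json.loads(call.function.arguments))
    messages.append({"role": "assistant", "content": reply.content,
                     "tool_calls": [call.model_dump()]})
    messages.append({"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)})
    return client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS).choices[0].message.content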
thank you!