mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
https://localai.io
MIT License

Response needs to process functions [panic: Unrecognized schema: map[]] #2223

Open gericho opened 6 months ago

gericho commented 6 months ago

LocalAI version:

quay.io/go-skynet/local-ai:master-sycl-f16-ffmpeg

Environment, CPU architecture, OS, and Version:

Docker on Proxmox LXC with iGPU pass-through; i3-N300 host with 32GB RAM; LXC with 6 cores, 16GB RAM

Describe the bug

Using the Extended OpenAI Conversation integration by @jekalmin (here), LocalAI crashes when generating a function call with the model fakezeta/Phi3-openvino-int8 (thank you @fakezeta!!)

NOTE: with the function removed completely, the assistant works well for conversation

- spec:
    name: execute_services
    description: Use this function to execute service of devices in Home Assistant.
    parameters:
      type: object
      properties:
        list:
          type: array
          items:
            type: object
            properties:
              domain:
                type: string
                description: The domain of the service
              service:
                type: string
                description: The service to be called
              service_data:
                type: object
                description: The service data object to indicate what to control.
                properties:
                  entity_id:
                    type: string
                    description: The entity_id retrieved from available devices. It
                      must start with domain, followed by dot character.
                required:
                - entity_id
            required:
            - domain
            - service
            - service_data
  function:
    type: native
    name: execute_service

To Reproduce

Once the entity list and states are specified, asking to turn on a light makes LocalAI crash.

Expected behavior

Home Assistant, using the Extended OpenAI Conversation integration, produces HA-compatible function calls.

Logs


6:39PM DBG Request received: {"model":"phi3","language":"","n":0,"top_p":1,"top_k":null,"temperature":0.7,"max_tokens":150,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{},"size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"system","content":"I want you to act as smart home manager of Home Assistant.\nI will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.\nThe current state of devices is provided in available devices.\nUse execute_services function to execute the action.\nDo not restate or appreciate what user says, rather make a quick inquiry.\nDo not ask any confirmation.\nRead the time in human readable format only.\nBe always very short on responses!\n\nCurrent Time: 2024-05-02 18:39:30.913070+02:00\n\nAvailable Devices:\n```csv\nentity_id,name,state,aliases\nlight.tavolo,Tavolo,off,table light/table\n```"},{"role":"user","content":"turn on table"}],"functions":[{"name":"execute_services","description":"Use this function to execute service of devices in Home Assistant.","parameters":{"properties":{"list":{"items":{"properties":{"domain":{"description":"The domain of the service","type":"string"},"service":{"description":"The service to be called","type":"string"},"service_data":{"description":"The service data object to indicate what to control.","properties":{"entity_id":{"description":"The entity_id retrieved from available devices. It must start with domain, followed by dot character.","type":"string"}},"required":["entity_id"],"type":"object"}},"required":["domain","service","service_data"],"type":"object"},"type":"array"}},"type":"object"}}],"function_call":"auto","stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
6:39PM DBG Configuration read: &{PredictionOptions:{Model:fakezeta/Phi-3-mini-128k-instruct-ov-int8 Language: N:0 TopP:0xc0000152c8 TopK:0xc000464708 Temperature:0xc0000152c0 Maxtokens:0xc000015298 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc000464738 TypicalP:0xc000464730 Seed:0xc000464778 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:phi3 F16:0xc0004646f8 Threads:0xc0004646f0 Debug:0xc0000156c8 Roles:map[] Embeddings:false Backend:transformers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:true} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString:auto functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false NoGrammar:false ResponseRegex:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000464728 MirostatTAU:0xc000464720 Mirostat:0xc000464718 NGPULayers:0xc000464748 MMap:0xc000464770 MMlock:0xc000464771 LowVRAM:0xc000464771 Grammar: StopWords:[<|end|>] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0004646d0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:true EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: ModelType:OVModelForCausalLM YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
6:39PM DBG Response needs to process functions
panic: Unrecognized schema: map[]

goroutine 10 [running]:
github.com/go-skynet/LocalAI/pkg/functions.(*JSONSchemaConverter).visit(0xc0003f1f70, 0xc0002ae210, {0x0, 0x0}, 0xc0002ae210)
        /build/pkg/functions/grammar_json_schema.go:224 +0x69b
github.com/go-skynet/LocalAI/pkg/functions.(*JSONSchemaConverter).Grammar(0xc0003f1f70, 0x2?, 0x0)
        /build/pkg/functions/grammar_json_schema.go:255 +0x25
github.com/go-skynet/LocalAI/pkg/functions.(*JSONSchemaConverter).GrammarFromBytes(0xc0003f1f70, {0xc0002f71c8, 0x2, 0x8}, 0x0)
        /build/pkg/functions/grammar_json_schema.go:262 +0x6d
github.com/go-skynet/LocalAI/pkg/functions.JSONFunctionStructure.Grammar({{0x0, 0x0, 0x0}, {0x0, 0x0, 0x0}, 0x0}, {0x0, 0x0}, 0x0)
        /build/pkg/functions/grammar_json_schema.go:297 +0xa8
github.com/go-skynet/LocalAI/core/http/endpoints/openai.ChatEndpoint.func3(0xc0002f2008)
        /build/core/http/endpoints/openai/chat.go:220 +0xa29
github.com/gofiber/fiber/v2.(*Ctx).Next(0xc000110540?)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/ctx.go:1027 +0x3d
github.com/go-skynet/LocalAI/core/http.App.func4(0xc000156000?)
        /build/core/http/app.go:122 +0x1d4
github.com/gofiber/fiber/v2.(*App).next(0xc0001a2a08, 0xc0002f2008)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/router.go:145 +0x1be
github.com/gofiber/fiber/v2.(*Ctx).Next(0xc0002f2008?)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/ctx.go:1030 +0x4d
github.com/go-skynet/LocalAI/core/http.App.LocalAIMetricsAPIMiddleware.func7(0xc0002f2008)
        /build/core/http/endpoints/localai/metrics.go:38 +0xa5
github.com/gofiber/fiber/v2.(*Ctx).Next(0xc0002f2008?)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/ctx.go:1027 +0x3d
github.com/gofiber/contrib/fiberzerolog.New.func1(0xc0002f2008)
        /root/go/pkg/mod/github.com/gofiber/contrib/fiberzerolog@v1.0.0/zerolog.go:36 +0xb7
github.com/gofiber/fiber/v2.(*App).next(0xc0001a2a08, 0xc0002f2008)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/router.go:145 +0x1be
github.com/gofiber/fiber/v2.(*App).handler(0xc0001a2a08, 0x49b56f?)
        /root/go/pkg/mod/github.com/gofiber/fiber/v2@v2.52.4/router.go:172 +0x78
github.com/valyala/fasthttp.(*Server).serveConn(0xc000120200, {0xe0452b8, 0xc0002b4000})
        /root/go/pkg/mod/github.com/valyala/fasthttp@v1.51.0/server.go:2359 +0xe70
github.com/valyala/fasthttp.(*workerPool).workerFunc(0xc00027d220, 0xc0003d4040)
        /root/go/pkg/mod/github.com/valyala/fasthttp@v1.51.0/workerpool.go:224 +0xa4
github.com/valyala/fasthttp.(*workerPool).getCh.func1()
        /root/go/pkg/mod/github.com/valyala/fasthttp@v1.51.0/workerpool.go:196 +0x32
created by github.com/valyala/fasthttp.(*workerPool).getCh in goroutine 1
        /root/go/pkg/mod/github.com/valyala/fasthttp@v1.51.0/workerpool.go:195 +0x190

Additional context
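
The trace above points at JSONSchemaConverter.visit in pkg/functions/grammar_json_schema.go, the converter that turns the request's JSON schema into a grammar. A minimal Go sketch of that kind of dispatch (an illustration of the failure mode, not the actual LocalAI source):

package main

import "fmt"

// visit mimics a schema-to-grammar walker: it dispatches on the schema
// keys it recognizes and panics on anything else.
func visit(schema map[string]interface{}) string {
	switch {
	case schema["oneOf"] != nil || schema["anyOf"] != nil:
		return "alternatives rule"
	case schema["$ref"] != nil:
		return "reference rule"
	case schema["type"] == "object":
		return "object rule"
	case schema["type"] == "array":
		return "array rule"
	default:
		// An empty map falls through to here, matching the
		// "panic: Unrecognized schema: map[]" seen in the logs.
		panic(fmt.Sprintf("Unrecognized schema: %v", schema))
	}
}

func main() {
	visit(map[string]interface{}{}) // panics: Unrecognized schema: map[]
}

Note that in the trace JSONFunctionStructure.Grammar appears to be called with an all-zero structure, i.e. the converter receives an empty schema even though the request clearly carried the function definition.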

localai-bot commented 6 months ago

I apologize for the inconvenience. It seems like there might be an issue with the schema processing. I suggest trying to update LocalAI to the latest version or reinstalling it to resolve this issue. If the problem persists, please open a new ticket with specific details about the debug logs and the steps to reproduce the issue, so we can look into this further. Thank you for bringing this to our attention, and I assure you that we'll work on solving this problem as quickly as possible.

Xav-v commented 6 months ago

Same issue here. Tested with other models, e.g. Hermes-2-Pro-Llama-3-8B-GGUF or Home-3B-v3-GGUF; they produce the same result: panic: Unrecognized schema: map[]

gericho commented 6 months ago

Unfortunately, it is still an issue with the 2024-05-07T11:38:35.997Z build.

xjm1285 commented 6 months ago

Same issue, following the function example.

R3dC4p commented 6 months ago

Also facing this issue. It occurs regardless of the model used; I've tried several that are supposed to work with functions. It should be noted that when interfacing with OpenAI's official API instead, it works as expected.

fakezeta commented 6 months ago

@mudler can you take a look? It seems to be something common to all the backends and models.

mudler commented 6 months ago

I suppose this is happening only with transformers models, right? Function calls are automatically tested by the CI (however, only with llama.cpp, as it runs easily on the runners).

Function calls map automatically to grammars, which are currently supported only by llama.cpp. However, you should be able to disable that behavior, turning off grammars and extracting the tool arguments from the LLM response, by specifying in the YAML file:

function:
  no_grammar: true
  response_regex: "..."

The response regex has to be a regex with named parameters that capture the function name and the arguments. For instance:

(?P<function>\w+)\s*\((?P<arguments>.*)\)

will catch

function_name({ "foo": "bar"})

Update: I've updated the docs now to mention this specific setting here: https://localai.io/features/openai-functions/#use-functions-without-grammars
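
For illustration, a self-contained Go sketch of how such a named-group regex extracts the two pieces (this mirrors the pattern above, not LocalAI's internal parsing code):

package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`(?P<function>\w+)\s*\((?P<arguments>.*)\)`)
	llmOutput := `function_name({ "foo": "bar"})`

	match := re.FindStringSubmatch(llmOutput)
	if match == nil {
		fmt.Println("no function call found")
		return
	}
	// SubexpNames returns the named groups by index; index 0 is the
	// whole match and is skipped.
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			fmt.Printf("%s: %s\n", name, match[i])
		}
	}
	// Output:
	// function: function_name
	// arguments: { "foo": "bar"}
}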

maxi1134 commented 6 months ago

Does anyone know how to apply this "no_grammar" fix to the Extended OpenAI integration for HA?

I added the YAML lines to my model.yaml with no luck:

name: gpt-4
mmap: true
parameters:
  model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
function:
  # set to true to not use grammars
  no_grammar: true
  # set a regex to extract the function tool arguments from the LLM response
  response_regex: "(?P<function>\w+)\s*\((?P<arguments>.*)\)"
template:
  chat_message: |
    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
    {{- if .FunctionCall }}
    <tool_call>
    {{- else if eq .RoleName "tool" }}
    <tool_response>
    {{- end }}
    {{- if .Content}}
    {{.Content }}
    {{- end }}
    {{- if .FunctionCall}}
    {{toJson .FunctionCall}}
    {{- end }}
    {{- if .FunctionCall }}
    </tool_call>
    {{- else if eq .RoleName "tool" }}
    </tool_response>
    {{- end }}<|im_end|>
  # https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B-GGUF#prompt-format-for-function-calling
  function: |
    <|im_start|>system
    You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into >
    <tools>
    {{range .Functions}}
    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
    {{end}}
    </tools>
    Use the following pydantic model json schema for each tool call you will make:
    {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}
    For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
    <tool_call>
    {'arguments': <args-dict>, 'name': <function-name>}
    </tool_call><|im_end|>
    {{.Input -}}
    <|im_start|>assistant
    <tool_call>
  chat: |
  ......

Edit1:

I tried with this one as well with no luck:

context_size: 4096
threads: 8
f16: true
gpu_layers: 36
low_vram: false
mmap: false
mmlock: false
name: LaserDolphinMix14bQ6
parameters:
  model: fc-dolphin-2.6-mistral-7b-dpo-laser.Q6_K.gguf
  temperature: 0.2
stopwords:
- "user|"
- "assistant|"
- "system|"
- "<|im_end|>"
- "<|im_start|>"
template:
  chat: laser-chat
  chat_message: laser-chat-block
  completion: laser-completion
function:
  # set to true to not use grammars
  no_grammar: true
  # set a regex to extract the function tool arguments from the LLM response
  response_regex: "(?P<function>\w+)\s*\((?P<arguments>.*)\)"

Setting functions to "0" works, but setting it to 1 makes it crash with the map[] error.
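
One thing worth checking in the snippets above (a YAML-quoting observation, not a confirmed cause of the crash): in a double-quoted YAML scalar, sequences like \w, \s and \( are invalid escapes that strict parsers reject or mangle. A single-quoted scalar passes the backslashes through literally:

function:
  no_grammar: true
  # single-quoted so YAML does not try to interpret \w, \s, \( as escapes
  response_regex: '(?P<function>\w+)\s*\((?P<arguments>.*)\)'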

R3dC4p commented 6 months ago

If it helps, this is the request body sent by @jekalmin's integration:

POST /test/chat/completions HTTP/1.1
Host: r3dc4p.requestcatcher.com
Accept: application/json
Accept-Encoding: gzip, deflate, br
Authorization: Bearer -
Connection: keep-alive
Content-Length: 2897
Content-Type: application/json
User-Agent: AsyncOpenAI/Python 1.3.8
X-Stainless-Arch: x64
X-Stainless-Async: async:asyncio
X-Stainless-Lang: python
X-Stainless-Os: Linux
X-Stainless-Package-Version: 1.3.8
X-Stainless-Runtime: CPython
X-Stainless-Runtime-Version: 3.12.2
{
    "messages": [
        {
            "role": "system",
            "content": "I want you to act as smart home manager of Home Assistant.\nI will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.\n(CSV of Home Assistant stuff here)```\n\nThe current state of devices is provided in available devices.\nUse execute_services function only for requested action, not for current states.\nDo not execute service without user's confirmation.\nDo not restate or appreciate what user says, rather make a quick inquiry."
        },
        {
            "role": "user",
            "content": "test"
        }
    ],
    "model": "gpt4",
    "function_call": "auto",
    "functions": [
        {
            "name": "execute_services",
            "description": "Use this function to execute service of devices in Home Assistant.",
            "parameters": {
                "type": "object",
                "properties": {
                    "list": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "domain": {
                                    "type": "string",
                                    "description": "The domain of the service"
                                },
                                "service": {
                                    "type": "string",
                                    "description": "The service to be called"
                                },
                                "service_data": {
                                    "type": "object",
                                    "description": "The service data object to indicate what to control.",
                                    "properties": {
                                        "entity_id": {
                                            "type": "string",
                                            "description": "The entity_id retrieved from available devices. It must start with domain, followed by dot character."
                                        }
                                    },
                                    "required": [
                                        "entity_id"
                                    ]
                                }
                            },
                            "required": [
                                "domain",
                                "service",
                                "service_data"
                            ]
                        }
                    }
                }
            }
        }
    ],
    "max_tokens": 150,
    "temperature": 0.5,
    "top_p": 1,
    "user": "01HXJDWGCR4PHR95BA9NDG9JVX"
}

And from the LocalAI Log:

7:59PM ERR Server error error="failed reading parameters from request:failed parsing request body: unexpected end of JSON input" ip=10.0.0.17 latency="98.493µs" method=POST status=500 url=/v1/chat/completions

Edited because I was figuring out GitHub's Markdown

R3dC4p commented 6 months ago

Okay, I did some more digging. I found evidence of this working as recently as February in a blog post, and proceeded to pull 2.8.2 and implement it as described here: https://theawesomegarage.com/blog/configure-a-local-llm-to-control-home-assistant-instead-of-chatgpt . Requests work fine in that release, while the same model and config do not work on latest. Some change between then and now seems to have broken the functionality.

xjm1285 commented 6 months ago

I found that removing function_call from the request avoids this issue.

R3dC4p commented 6 months ago

Well, yeah, but that kind of defeats the purpose: you need the function call to actually control HA.

maxi1134 commented 6 months ago

It seems to work with this older version: 2.8.2

https://github.com/jekalmin/extended_openai_conversation/issues/213#issuecomment-2105547979

fakezeta commented 6 months ago

> I suppose this is happening only with transformers models, right? Function calls are automatically tested by the CI (however, only with llama.cpp, as it runs easily on the runners).
>
> Function calls map automatically to grammars, which are currently supported only by llama.cpp. However, you should be able to disable that behavior, turning off grammars and extracting the tool arguments from the LLM response, by specifying in the YAML file:
>
> function:
>   no_grammar: true
>   response_regex: "..."
>
> The response regex has to be a regex with named parameters that capture the function name and the arguments. For instance:
>
> (?P<function>\w+)\s*\((?P<arguments>.*)\)
>
> will catch
>
> function_name({ "foo": "bar"})
>
> Update: I've updated the docs now to mention this specific setting here: https://localai.io/features/openai-functions/#use-functions-without-grammars

Several reports are with llama.cpp, which is why I doubt it's a transformers issue.

xjm1285 commented 6 months ago

> Okay, I did some more digging. I found evidence of this working as recently as February in a blog post, and proceeded to pull 2.8.2 and implement it as described here: https://theawesomegarage.com/blog/configure-a-local-llm-to-control-home-assistant-instead-of-chatgpt . Requests work fine in that release, while the same model and config do not work on latest. Some change between then and now seems to have broken the functionality.

2.8.2 seems to work fine

VitaminTe commented 6 months ago

No luck for me with the proposed fix. I'm going to pull 2.8.2 and give that a try.

maxi1134 commented 6 months ago

> > Okay, I did some more digging. I found evidence of this working as recently as February in a blog post, and proceeded to pull 2.8.2 and implement it as described here: https://theawesomegarage.com/blog/configure-a-local-llm-to-control-home-assistant-instead-of-chatgpt . Requests work fine in that release, while the same model and config do not work on latest. Some change between then and now seems to have broken the functionality.
>
> 2.8.2 seems to work fine

Are you able to get the functions to work at all? For me it doesn't produce the error, but I don't think the functions are being called.

xjm1285 commented 6 months ago

> > > Okay, I did some more digging. I found evidence of this working as recently as February in a blog post, and proceeded to pull 2.8.2 and implement it as described here: https://theawesomegarage.com/blog/configure-a-local-llm-to-control-home-assistant-instead-of-chatgpt . Requests work fine in that release, while the same model and config do not work on latest. Some change between then and now seems to have broken the functionality.
> >
> > 2.8.2 seems to work fine
>
> Are you able to get the functions to work at all? For me it doesn't produce the error, but I don't think the functions are being called.

I just tested the function example; below are the two responses. First response:

[{'role': 'user', 'content': "What's the weather like in Boston?"}, ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{"location":"Boston","unit":"celsius"}', name='get_current_weather', function='get_current_weather'), tool_calls=None), {'role': 'function', 'name': 'get_current_weather', 'content': '{"location": "Boston", "temperature": "72", "unit": "celsius", "forecast": ["sunny", "windy"]}'}] 

Second response:

ChatCompletion(id='e7b13299-9b1a-41d3-b7a6-d5e1dc8147b8', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The current temperature in Boston is 72 degrees Celsius. The weather is sunny and windy.', role='assistant', function_call=None, tool_calls=None))], created=1715612514, model='hermes-2-pro-llama-3-8b', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=0, prompt_tokens=0, total_tokens=0)) 

It seems to work fine.

maxi1134 commented 6 months ago

> > > > Okay, I did some more digging. I found evidence of this working as recently as February in a blog post, and proceeded to pull 2.8.2 and implement it as described here: https://theawesomegarage.com/blog/configure-a-local-llm-to-control-home-assistant-instead-of-chatgpt . Requests work fine in that release, while the same model and config do not work on latest. Some change between then and now seems to have broken the functionality.
> > >
> > > 2.8.2 seems to work fine
> >
> > Are you able to get the functions to work at all? For me it doesn't produce the error, but I don't think the functions are being called.
>
> I just tested the function example; below are the two responses. First response:
>
> [{'role': 'user', 'content': "What's the weather like in Boston?"}, ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{"location":"Boston","unit":"celsius"}', name='get_current_weather', function='get_current_weather'), tool_calls=None), {'role': 'function', 'name': 'get_current_weather', 'content': '{"location": "Boston", "temperature": "72", "unit": "celsius", "forecast": ["sunny", "windy"]}'}]
>
> Second response:
>
> ChatCompletion(id='e7b13299-9b1a-41d3-b7a6-d5e1dc8147b8', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The current temperature in Boston is 72 degrees Celsius. The weather is sunny and windy.', role='assistant', function_call=None, tool_calls=None))], created=1715612514, model='hermes-2-pro-llama-3-8b', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=0, prompt_tokens=0, total_tokens=0))
>
> It seems to work fine.

I see!

What about the "execute_services" one?

I tried something simple like "Turn off the office lights", which works with OpenAI with the same configuration, but no luck with LocalAI on 2.8.2.

This is my function:

- spec:
    name: execute_services
    description: Use this function to execute service of devices in Home Assistant.
    parameters:
      type: object
      properties:
        list:
          type: array
          items:
            type: object
            properties:
              domain:
                type: string
                description: The domain of the service
              service:
                type: string
                description: The service to be called
              service_data:
                type: object
                description: The service data object to indicate what to control.
                properties:
                  entity_id:
                    type: string
                    description: The entity_id retrieved from available devices. It must start with domain, followed by dot character.
                required:
                - entity_id
            required:
            - domain
            - service
            - service_data
  function:
    type: native
    name: execute_service

R3dC4p commented 6 months ago

Whether or not the functions work depends heavily on the model and whether you have the entity exposed.

remimikalsen commented 6 months ago

Hi, I'm the author of the blog post mentioned above (theawesomegarage). I had the Extended OpenAI Conversation integration working back in February, installed LocalAI just now on a new server with the latest version, and I get the same panic error; my LocalAI docker container crashes.

The Home Assistant integration works as expected if I instead use the actual OpenAI API as my back-end, or the old version of LocalAI. If I disable function calls in the Extended OpenAI Conversation integration, LocalAI doesn't crash anymore, but I can't interact with Home Assistant either; I just have a pleasant conversation with the AI.

R3dC4p commented 5 months ago

> Hi, I'm the author of the blog post mentioned above (theawesomegarage). I had the Extended OpenAI Conversation integration working back in February, installed LocalAI just now on a new server with the latest version, and I get the same panic error; my LocalAI docker container crashes.
>
> The Home Assistant integration works as expected if I instead use the actual OpenAI API as my back-end, or the old version of LocalAI. If I disable function calls in the Extended OpenAI Conversation integration, LocalAI doesn't crash anymore, but I can't interact with Home Assistant either; I just have a pleasant conversation with the AI.

Good to see you here! I have to thank you for that post, because it got me started on the concept of local ChatGPT integration with Home Assistant. I have found an alternative to LocalAI for the moment: Ollama with the Llama Conversation custom integration by acon96. I do hope the LocalAI compatibility gets sorted though, as I prefer it over Ollama.

mudler commented 5 months ago

I can't reproduce it here. However, in the new LocalAI version I focused on enhancing tool support: you can now also disable grammars entirely if the model supports that. I've tested with Hermes; if this issue persists when using grammars, you can disable them as follows (but note that this relies on the LLM replying with valid JSON, and it will break if the model hallucinates):

name: nous-hermes
mmap: true
parameters:
  model: huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
context_size: 8192

stopwords:
- "<|im_end|>"
- "<dummy32000>"
- "</tool_call>"
- "<|eot_id|>"
- "<|end_of_text|>"

function:
  # disable injecting the "answer" tool
  disable_no_action: true

  grammar:
    # This allows the grammar to also return messages
    #mixed_mode: true
    disable: true
    # Prefix to add to the grammar
    #prefix: '<tool_call>\n'
    # Force parallel calls in the grammar
    # parallel_calls: true

  return_name_in_function_response: true
  # Without grammar uncomment the lines below
  # Warning: this is relying only on the capability of the
  # LLM model to generate the correct function call.
  json_regex_match: 
   - "(?s)<tool_call>(.*?)</tool_call>"
   - "(?s)<tool_call>(.*?)"
  replace_llm_results:
  # Drop the scratchpad content from responses
  - key: "(?s)<scratchpad>.*</scratchpad>"
    value: ""
  replace_function_results: 
  # Replace everything that is not JSON array or object
  # 
  - key: '(?s)^[^{\[]*'
    value: ""
  - key: '(?s)[^}\]]*$'
    value: ""
  - key: "'([^']*?)'"
    value: "_DQUOTE_${1}_DQUOTE_"
  - key: '\\"'
    value: "__TEMP_QUOTE__"
  - key: "\'"
    value: "'"
  - key: "_DQUOTE_"
    value: '"'
  - key: "__TEMP_QUOTE__"
    value: '"'
  # Drop the scratchpad content from responses
  - key: "(?s)<scratchpad>.*</scratchpad>"
    value: ""

template:
  chat: |
    {{.Input -}}
    <|im_start|>assistant
  chat_message: |
    <|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
    {{- if .FunctionCall }}
    <tool_call>
    {{- else if eq .RoleName "tool" }}
    <tool_response>
    {{- end }}
    {{- if .Content}}
    {{.Content }}
    {{- end }}
    {{- if .FunctionCall}}
    {{toJson .FunctionCall}}
    {{- end }}
    {{- if .FunctionCall }}
    </tool_call>
    {{- else if eq .RoleName "tool" }}
    </tool_response>
    {{- end }}<|im_end|>
  completion: |
    {{.Input}}
  function: |-
    <|im_start|>system
    You are a function calling AI model.
    Here are the available tools:
    <tools>
    {{range .Functions}}
    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
    {{end}}
    </tools>
    You should call the tools provided to you sequentially
    Please use <scratchpad> XML tags to record your reasoning and planning before you call the functions as follows:
    <scratchpad>
    {step-by-step reasoning and plan in bullet points}
    </scratchpad>
    For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
    <tool_call>
    {"arguments": <args-dict>, "name": <function-name>}
    </tool_call><|im_end|>
    {{.Input -}}
    <|im_start|>assistant
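
As a side note on how the replace_function_results list above operates: each key is a regex applied in order to the raw LLM output, with the value as the replacement. A short Go sketch of that mechanism on single-quoted, almost-JSON output (an illustration of the intent, using only the first replacements from the list; not LocalAI's actual code):

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Ordered (pattern, replacement) pairs, applied one after another.
	replacements := []struct{ key, value string }{
		{`(?s)^[^{\[]*`, ``},                   // drop anything before the JSON
		{`(?s)[^}\]]*$`, ``},                   // drop anything after it
		{`'([^']*?)'`, `_DQUOTE_${1}_DQUOTE_`}, // protect single-quoted spans
		{`_DQUOTE_`, `"`},                      // turn them into JSON quotes
	}

	out := `<tool_call> {'name': 'get_weather', 'arguments': {'city': 'Boston'}} and some text`
	for _, r := range replacements {
		out = regexp.MustCompile(r.key).ReplaceAllString(out, r.value)
	}
	fmt.Println(out)
	// {"name": "get_weather", "arguments": {"city": "Boston"}}
}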

Of note: I've updated the Hermes models in the gallery with the mixed JSON grammar support that was introduced in 2.16.0; feedback is welcome.

You can also now find two new models in the model gallery that are fine-tuned to fully leverage LocalAI's JSON grammar support:

maxi1134 commented 5 months ago

I am now trying mistral-7b-instruct-v0.3, as it seems to output the right services, but it still doesn't work. This is what I see:

(screenshot attached)

Xav-v commented 5 months ago

Same here, but with a different issue with this model:

local-ai    | 9:14AM DBG ParseTextContent:  [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}]
local-ai    | 9:14AM DBG CaptureLLMResult: []
local-ai    | 9:14AM DBG LLM result:  [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}]
local-ai    | 9:14AM DBG LLM result(processed):  [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}]
local-ai    | 9:14AM DBG LLM result:  [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}]
local-ai    | 9:14AM DBG Replacing (?s)^[^{\[]* with 
local-ai    | 9:14AM DBG Replacing (?s)[^}\]]*$ with 
local-ai    | 9:14AM DBG Replacing (?s)\[TOOL\_CALLS\] with 
local-ai    | 9:14AM DBG Replacing (?s)\[\/TOOL\_CALLS\] with 
local-ai    | 9:14AM DBG LLM result(function cleanup): [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}]
local-ai    | 9:14AM DBG unable to unmarshal llm result in a single object or an array of JSON objects error="invalid character ']' after object key:value pair" escapedLLMResult="[{\"name\": \"execute_services\", \"arguments\": {\"domain\": \"script\", \"service\": \"turnoff\", \"service_data\": {\"entity_id\": \"binary_sensor.hdd_status\"}}]"
local-ai    | 9:14AM DBG Function return: [{"name": "execute_services", "arguments": {"domain": "script", "service": "turnoff", "service_data": {"entity_id": "binary_sensor.hdd_status"}}] []
local-ai    | 9:14AM DBG Text content to return: 
local-ai    | 9:14AM DBG nothing function results but we had a message from the LLM
local-ai    | 9:14AM DBG Response: {"created":1717492093,"object":"chat.completion","id":"b6674bee-7c82-4362-a2cf-97fb17f0522b","model":"mistral-7b-instruct-v0.3","choices":[{"index":0,"finish_reason":"","message":{"role":"assistant","content":" [{\"name\": \"execute_services\", \"arguments\": {\"domain\": \"script\", \"service\": \"turnoff\", \"service_data\": {\"entity_id\": \"binary_sensor.hdd_status\"}}]"}}],"usage":{"prompt_tokens":1020,"completion_tokens":47,"total_tokens":1067}}

Model: mistral-7b-instruct-v0.3
YAML: default (models/mistral-7b-instruct-v0.3.yaml)
Error: DBG unable to unmarshal llm result in a single object or an array of JSON objects error="invalid character ']' after object key:value pair" escapedLLMResult="[{\"name\": \"execute_services\", \"arguments\": {\"domain\": \"script\", \"service\": \"turnoff\", \"service_data\": {\"entity_id\": \"binary_sensor.hdd_status\"}}]"

In other words, the model's output is missing the closing } of the outermost object, so the array never parses as JSON.

cesinsingapore commented 5 months ago

This also happens with the llama3-7b model.

yonitjio commented 5 months ago

I don't know if this helps since I don't use Extended OpenAI Conversation.

In my case, this occurs if the array items type is object.

I "fixed" it with using $defs and $ref. Something like this:

"parameters":
  "type":"object",
  "properties":{
    "object_ids": {
      "type":"array"
      "description":"Object Ids",
      "$defs":{
        "ValueObject":{
          "title":"ValueObject",
          "type":"object"
          "properties":{
            "value":{
              "title":"Value",
              "type":"integer"
            }
          },
          "required":["value"],
        }
      },
      "items":{
        "$ref":"#/$defs/ValueObject"
      },
    }
  },
  "required":["object_ids"]

But since I can't add $defs to the root schema, I changed this line: https://github.com/mudler/LocalAI/blob/d38e9090df32dfd239b36d3ae284b5d8ec87b7a5/pkg/functions/grammar_json_schema.go#L298

From

itemRuleName := sc.visit(items, fmt.Sprintf("%s-item", ruleName), rootSchema)

to

itemRuleName := sc.visit(items, fmt.Sprintf("%s-item", ruleName), schema)
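
To illustrate why that one-line change matters (a sketch of the general $ref-resolution mechanism, using a hypothetical resolveRef helper; not LocalAI's actual resolver): a pointer like #/$defs/ValueObject is walked from whatever schema is handed in as the resolution root, so a $defs block nested inside a property is only reachable when the local schema, rather than the root, is used.

package main

import (
	"fmt"
	"strings"
)

// resolveRef walks a "#/"-style JSON pointer against the given root map.
// Hypothetical helper for illustration only.
func resolveRef(ref string, root map[string]interface{}) (map[string]interface{}, bool) {
	cur := root
	for _, part := range strings.Split(strings.TrimPrefix(ref, "#/"), "/") {
		next, ok := cur[part].(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

func main() {
	// The "object_ids" property schema from the workaround above.
	property := map[string]interface{}{
		"$defs": map[string]interface{}{
			"ValueObject": map[string]interface{}{"type": "object"},
		},
	}
	rootSchema := map[string]interface{}{} // the root carries no $defs

	_, ok := resolveRef("#/$defs/ValueObject", rootSchema)
	fmt.Println(ok) // false: resolving against the root fails

	_, ok = resolveRef("#/$defs/ValueObject", property)
	fmt.Println(ok) // true: resolving against the local schema succeeds
}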

Artekus commented 4 months ago

I was able to stop it crashing the container by putting the description on a single line and setting the option to use tools.

                    description: The entity_id retrieved from available devices. It
                      must start with domain, followed by dot character.

as

                   description: The entity_id retrieved from available devices. It must start with domain, followed by dot character.