jekalmin / extended_openai_conversation

Home Assistant custom component of conversation agent. It uses OpenAI to control your devices.

Local model via llama-cpp-python support #72

Open luzik opened 9 months ago

luzik commented 9 months ago

llama.cpp is now the best backend for open-source models, and llama-cpp-python (used as the Python backend for Python-powered GUIs) has built-in OpenAI-compatible API support, including function (tool) calling:

https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling https://github.com/abetlen/llama-cpp-python#function-calling

There is also Docker support for this tool, so I wanted to get all of these things running together.

I have read https://github.com/jekalmin/extended_openai_conversation/issues/17, but that is mostly about LocalAI. LocalAI uses llama-cpp-python as its backend, so why not take the shortcut and use llama-cpp-python directly?
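For reference, here is a minimal sketch of talking to such a server with the openai Python client (my illustration, not from the docs; port 8008 matches my compose file below, and the model name is just a placeholder, since llama-cpp-python serves whatever MODEL it was started with):

```python
from openai import OpenAI

# llama-cpp-python exposes an OpenAI-compatible API; no real key is required.
client = OpenAI(base_url="http://localhost:8008/v1", api_key="sk-local")

response = client.chat.completions.create(
    model="local",  # placeholder; the server uses the model given via MODEL
    messages=[{"role": "user", "content": "turn on the kitchen light"}],
)
print(response.choices[0].message)
```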

My docker-compose looks like this (with llama-cpp-python git-cloned; if you do not need GPU support, just use the commented #image line instead of build:):

version: '3.4'
services:
  llama-cpp-python:
    container_name: llama-cpp-python
    #image: ghcr.io/abetlen/llama-cpp-python:latest
    build: llama-cpp-python/docker/cuda_simple  # docker-compose build --no-cache
    environment:
      #- MODEL=/models/sha256:6ae28029995007a3ee8d0b8556d50f3b59b831074cf19c84de87acf51fb54054
      #- MODEL=/models/openchat_3.5-16k.Q4_K_M.gguf
      #- MODEL=/models/zephyr-7b-beta.Q5_K_M.gguf
      #- MODEL=/models/starling-lm-7b-alpha.Q5_K_M.gguf
      #- MODEL=/models/wizardcoder-python-13b-v1.0.Q4_K_M.gguf
      #- MODEL=/models/deepseek-coder-6.7b-instruct.Q5_K_M.gguf
      #- MODEL=/models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf
      #- MODEL=/models/phi-2.Q5_K_M.gguf
      - MODEL=/models/functionary-7b-v1.Q4_K_S.gguf
      - USE_MLOCK=0
    ports:
      - 8008:8000
    volumes:
      - ./models:/models
    restart: on-failure:0
    cap_add:
      - SYS_RESOURCE
    deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                device_ids: ['0']
                capabilities: [gpu]
    command: python3 -m llama_cpp.server --n_gpu_layers 33 --n_ctx 18192 --chat_format functionary

But I've got answers like:

turn on "wyspa" light Something went wrong: Service light.on not found. where is paris? Something went wrong: Service location.navigate not found.

Maybe something is wrong with my prompt?

jekalmin commented 9 months ago

turn on "wyspa" light Something went wrong: Service light.on not found. where is paris? Something went wrong: Service location.navigate not found.

I haven't tried llama-cpp-python yet, but the error message above happens when the LLM tries to call a service with light.on, which should be light.turn_on.

Since I don't know much about LLMs, I don't have a definitive answer. My guess is that the model you are using wasn't trained on much HA data. If that is the case, trying a different model might help.
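For context: Home Assistant only registers services such as light.turn_on, so a hallucinated name like light.on fails the service lookup. Roughly, the check looks like this (my illustration, not the component's actual code):

```python
from homeassistant.exceptions import ServiceNotFound

# `hass` is the running HomeAssistant instance; domain and service come from
# the model's function-call arguments, e.g. domain="light", service="on".
if not hass.services.has_service(domain, service):
    raise ServiceNotFound(domain, service)  # -> "Service light.on not found"
```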

I will try this as well later!

Also, I'd like to know what prompt you used. Probably the default prompt?

luzik commented 9 months ago

I think my model doesn't know anything about Home Assistant. Is there a way to provide service names with descriptions in the "tool spec"? For example, the light domain with its list of services?

jekalmin commented 9 months ago

I think my model doesn't know anything about Home Assistant.

I think so.

Is there a way to provide service names with descriptions in the "tool spec"? For example, the light domain with its list of services?

Maybe you can try setting an enum on domain and service:

- spec:
    name: execute_services
    description: Use this function to execute service of devices in Home Assistant.
    parameters:
      type: object
      properties:
        list:
          type: array
          items:
            type: object
            properties:
              domain:
                type: string
                description: The domain of the service
                enum:
                  - light
                  - switch
              service:
                type: string
                description: The service to be called
                enum:
                  - turn_on
                  - turn_off
              service_data:
                type: object
                description: The service data object to indicate what to control.
                properties:
                  entity_id:
                    type: array
                    items:
                      type: string
                      description: The entity_id retrieved from available devices. It must start with domain, followed by dot character.
                required:
                - entity_id
            required:
            - domain
            - service
            - service_data
  function:
    type: native
    name: execute_service

luzik commented 9 months ago

OK, after changing the model and applying those fixes, I get an HA error: Something went wrong: function ' execute_services' does not exist. Is this because of the extra space?

My debug shows:

llama-cpp-python    | user:
llama-cpp-python    | </s>turn on kuchnia light</s>
llama-cpp-python    | assistant execute_services:
llama-cpp-python    |
llama-cpp-python    | {
llama-cpp-python    |   "list": [
llama-cpp-python    |     {
llama-cpp-python    |       "domain": "light",
llama-cpp-python    |       "service": "turn_on",
llama-cpp-python    |       "service_data": {
llama-cpp-python    |         "entity_id": "light.kuchnia"
llama-cpp-python    |       }
llama-cpp-python    |     }
llama-cpp-python    |   ]
llama-cpp-python    | }

Maybe we can trim extra characters from function names?

jekalmin commented 9 months ago

Something went wrong: function ' execute_services' does not exist. Is this because of the extra space?

I think so.

Maybe we can trim extra characters from function names?

Without modifying the code, it's not possible. However, even if it worked, the results would not be satisfactory if the model wasn't trained on HA data.

Since providing enums in the spec is just a workaround, it would lead to problem after problem. Looking for a model trained on HA data, or a way to fine-tune a model, might be the better approach.

luzik commented 9 months ago

I've changed the model and now it doesn't need the enum anymore.

luzik commented 9 months ago

Maybe I can try to fix this trimming issue myself. Can you help me find the right place to start in your code?

jekalmin commented 9 months ago

I'm not certain where to put it, but this is the place that compares function names.
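At that comparison point, a trim could look roughly like this (a sketch with assumed variable names, not a tested patch):

```python
# Normalize the function name returned by the model before comparing it
# against the configured functions; this strips stray spaces and colons,
# turning " execute_services" or ": execute_services" into "execute_services".
function_name = message["function_call"]["name"].strip(" :")
```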

luzik commented 9 months ago

Thanks! But I have to dig more. Any clues from these logs?

homeassistant             | 2024-01-04 13:16:30.163 INFO (MainThread) [custom_components.extended_openai_conversation] Response {
homeassistant             |   "choices": [
homeassistant             |     {
homeassistant             |       "finish_reason": "tool_calls",
homeassistant             |       "index": 0,
homeassistant             |       "message": {
homeassistant             |         "content": null,
homeassistant             |         "function_call": {
homeassistant             |           "arguments": "{\n  \"list\": [\n    {\n      \"domain\": \"light\",\n      \"service\": \"turn_on\",\n      \"service_data\": {\n        \"entity_id\": \"light.kanapa\"\n      }\n    }\n  ]\n}      \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
homeassistant             |           "name": ": execute_services"
homeassistant             |         },
homeassistant             |         "role": "assistant",
homeassistant             |         "tool_calls": [
homeassistant             |           {
homeassistant             |             "function": {
homeassistant             |               "arguments": "{\n  \"list\": [\n    {\n      \"domain\": \"light\",\n      \"service\": \"turn_on\",\n      \"service_data\": {\n        \"entity_id\": \"light.kanapa\"\n      }\n    }\n  ]\n}      \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n",
homeassistant             |               "name": ": execute_services"
homeassistant             |             },
homeassistant             |             "id": ": execute_services",
homeassistant             |             "type": "function"
homeassistant             |           }
homeassistant             |         ]
homeassistant             |       }
homeassistant             |     }
homeassistant             |   ],
homeassistant             |   "created": 1704370580,
homeassistant             |   "id": "XXXX",
homeassistant             |   "model": "gpt-3.5-turbo",
homeassistant             |   "object": "chat.completion",
homeassistant             |   "usage": {
homeassistant             |     "completion_tokens": 150,
homeassistant             |     "prompt_tokens": 1858,
homeassistant             |     "total_tokens": 2008
homeassistant             |   }
homeassistant             | }
homeassistant             | 2024-01-04 13:16:30.166 ERROR (MainThread) [custom_components.extended_openai_conversation] native function 'execute_services' does not exist
homeassistant             | Traceback (most recent call last):
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 179, in async_process
homeassistant             |     response = await self.query(user_input, messages, exposed_entities, 0)
homeassistant             |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 361, in execute_function
homeassistant             |     result = await function_executor.execute(
homeassistant             |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/helpers.py", line 228, in execute
homeassistant             |     raise NativeNotFound(name)
homeassistant             | custom_components.extended_openai_conversation.exceptions.NativeNotFound: native function 'execute_services' does not exist

jekalmin commented 9 months ago

Maybe this is execute_services in your config, where it should be execute_service?

luzik commented 9 months ago

Yeah, I thought that was an error and changed it to execute_services. Thanks!

Now, with an extra

message["function_call"]["name"] = message["function_call"]["name"].strip(' :')

function calling is working OK, but after the call there is this response:

homeassistant             | 2024-01-04 14:21:24.395 INFO (MainThread) [custom_components.extended_openai_conversation] Prompt for gpt-3.5-turbo: [{'role': 'system', 'content': "You[.....]tion you need."}, {'role': 'user', 'content': 'turn wyspa off'}, {'role': 'function', 'name': 'execute_services', 'content': '[True]'}]

Is this some kind of confirmation? Because it gives:


homeassistant             | Traceback (most recent call last):
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 179, in async_process
homeassistant             |     response = await self.query(user_input, messages, exposed_entities, 0)
homeassistant             |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 377, in execute_function
homeassistant             |     return await self.query(user_input, messages, exposed_entities, n_requests)
homeassistant             |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 316, in query
homeassistant             |     message = await self.execute_function_call(
homeassistant             |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
homeassistant             |   File "/config/custom_components/extended_openai_conversation/__init__.py", line 344, in execute_function_call
homeassistant             |     raise FunctionNotFound(message["function_call"]["name"])
homeassistant             | custom_components.extended_openai_conversation.exceptions.FunctionNotFound: function 'none' does not exist

jekalmin commented 9 months ago

After a function is called, it makes another request to the LLM to get a response message. However, it seems this model then tries to call another function named "none", which doesn't exist.

Probably it's not aware that the function call succeeded, even though we returned 'content': '[True]'. Maybe you can try responding with 'content': '{success: True}' like here.
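To illustrate the flow (message shapes assumed for illustration, not the component's exact code): the function result is appended to the conversation as a function-role message, then the chat is sent again so the model can produce a plain-text reply.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8008/v1", api_key="sk-local")

messages = [{"role": "user", "content": "turn wyspa off"}]
# ... the model requests execute_services, we run it, then feed the result back:
messages.append({
    "role": "function",
    "name": "execute_services",
    "content": '{"success": true}',  # the suggestion above, instead of '[True]'
})

# Second request: the model should now answer in plain text rather than
# emitting another function call (like the spurious "none").
followup = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(followup.choices[0].message.content)
```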

luzik commented 9 months ago

This did not help, but:

  1. Why do we need to inform the model about success? (It wastes tokens.)
  2. I reported the function-calling incompatibility with the new model I am using, and I believe it will be fixed soon: https://github.com/abetlen/llama-cpp-python/issues/1061.

jekalmin commented 8 months ago

  1. Why do we need to inform the model about success? (It wastes tokens.)

We can't get a response message and a function call at the same time. We either have to call the API again to get a response message, or give up on the response message.

OperKH commented 8 months ago

I'm using LocalAI, and this integration works with the models functionary-7b-v1.4 and luna-ai-llama2-uncensored, but with some models, e.g. mistral-7b-openorca, I get this error:

function 'None' does not exist
Traceback (most recent call last):
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 187, in async_process
    response = await self.query(user_input, messages, exposed_entities, 0)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 312, in query
    message = await self.execute_function_call(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 339, in execute_function_call
    raise FunctionNotFound(function_name)
custom_components.extended_openai_conversation.exceptions.FunctionNotFound: function 'None' does not exist

Is this a model problem or an API problem, and how can it be fixed?

jekalmin commented 8 months ago

Maybe you can try dolphin-2.7-mixtral-8x7b, as Anto mentioned.

Since I haven't tried LocalAI much, I also need to try those. (I failed to get it working.)

Anto79-ops commented 8 months ago

@OperKH yes, this model works: https://huggingface.co/TheBloke/dolphin-2.7-mixtral-8x7b-GGUF BUT it cannot perform function/service calls.

Have you tried the functionary v2 model? I cannot get a template for that model to work with LocalAI. Supposedly it handles functions/tools better:

https://github.com/mudler/LocalAI/discussions/1641

neowisard commented 7 months ago

I'm using LocalAI, and this integration works with the models functionary-7b-v1.4 and luna-ai-llama2-uncensored, but with some models, e.g. mistral-7b-openorca, I get this error:

@OperKH did you get any of the functions/tools working? Or was it just conversation/answers?

Anto79-ops commented 7 months ago

It does functions only. It does not chat well, if at all, if I remember correctly.