jekalmin / extended_openai_conversation

A Home Assistant custom conversation agent component. It uses OpenAI to control your devices.

Another perspective #156


pajeronda commented 4 months ago

Hi @jekalmin, I saw in the pull requests that in a discussion you were looking for a solution to the "big prompt", to reduce the prompt tokens spent on sending the exposed entities with each call.

I propose my idea. Analyzing the current configuration of your component, it seems to me that all user-created functions depend on the same prompt. Hierarchically, the prompt is the pivot of the requests (correct me if I'm wrong).

Instead, I suggest considering another approach: placing the functions at the forefront, with a customized prompt attached to each. Restructuring things this way allows customized prompts to be attached to functions, and it would open up a world of possibilities (as well as savings :D ).

From: prompt → function A → function B → function C

To:

- function A → prompt A
- function B → prompt B
- function C → prompt C
- function D → prompt A (same prompt)

I tried to understand the code and saw that a custom prompt is already supported via CONF_PROMPT:

```python
def _generate_system_message(
    self, exposed_entities, user_input: conversation.ConversationInput
):
    raw_prompt = self.entry.options.get(CONF_PROMPT, DEFAULT_PROMPT)
    prompt = self._async_generate_prompt(raw_prompt, exposed_entities, user_input)
    return {"role": "system", "content": prompt}
```

So only custom prompt selection would need to be implemented.

Some examples: the prompt to use could be referenced directly in each function's YAML configuration. Alternatively, on the configuration page, a dropdown menu could be used to select the prompt for each function.
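A minimal sketch of what that selection could look like in the component's Python, assuming a hypothetical `prompts` mapping in the options and a hypothetical `prompt` key on each function spec (all names here are illustrative, not the component's current API):

```python
# Hypothetical option keys for this sketch (not part of the current component).
CONF_PROMPTS = "prompts"      # mapping of prompt name -> prompt template
CONF_FUNCTIONS = "functions"  # list of registered function settings


def select_raw_prompt(options: dict, function_name: str, default_prompt: str) -> str:
    """Pick the raw prompt attached to a function, falling back to the default."""
    prompts = options.get(CONF_PROMPTS, {})  # e.g. {"home": "...", "scrape": "..."}
    for setting in options.get(CONF_FUNCTIONS, []):
        spec = setting.get("spec", {})
        if spec.get("name") == function_name:
            # Each spec could carry a "prompt" key naming which prompt to use.
            return prompts.get(spec.get("prompt", ""), default_prompt)
    return default_prompt
```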

This new approach, or perspective, opens up the possibility of implementing a more independent assistant, one less tied to Home Assistant automation management alone. The possibility of using Assist on smartphones or smartwatches, too, is an opportunity to be exploited.

If you appreciate this new approach, also consider adding a folder in /config that allows saving prompt and function configurations as YAML files.
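As a rough illustration of loading such files (the folder layout, file names, and keys here are assumptions, not an existing feature):

```python
from pathlib import Path

import yaml


def load_prompts(config_dir: str) -> dict[str, str]:
    """Load each YAML file in a hypothetical prompts folder into a name -> prompt map."""
    prompts: dict[str, str] = {}
    for path in Path(config_dir, "extended_openai_conversation", "prompts").glob("*.yaml"):
        data = yaml.safe_load(path.read_text())  # expected shape: {"prompt": "You are ..."}
        if isinstance(data, dict) and "prompt" in data:
            prompts[path.stem] = str(data["prompt"])
    return prompts
```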

pajeronda commented 4 months ago

Another possible implementation that could be added is model selection, again with a drop-down menu. Today I discovered that gpt-4-turbo-preview didn't interpret the prompt correctly, while gpt-3.5-turbo did.

jekalmin commented 4 months ago

Thanks for your suggestion!

I added a comment to your ideas. Please correct me for any misunderstanding.

> placing the functions at the forefront, with a customized prompt attached to each

As far as I understand, your suggestion is a great fit for people who create each voice assistant for only one purpose (registering one function for each voice assistant). However, it would not cover people who want to ask one voice assistant anything (registering many functions for one voice assistant).

Some functions are tied to the prompt, such as area awareness (you need to add area_id to execute_services to use it), but some functions are not.

> Another possible implementation that could be added is model selection, again with a drop-down menu

I have also experienced that the model name is likely to be mistyped, but a free-text field is also flexible: it makes it possible to use any model name when pointing at any OpenAI-compatible server.

pajeronda commented 4 months ago

> I added a comment to your ideas. Please correct me for any misunderstanding.

> placing the functions at the forefront, with a customized prompt attached to each

I'll try to answer you with an example: if you use scrape or rest in your functions, in 99.9% of cases you don't need to send all the exposed devices in the prompt, because you are taking the data from an external link. But in the current configuration you cannot do otherwise: the exposed devices are sent anyway, which increases the number of tokens unnecessarily. If instead you could choose the prompt from some that you have previously set, all you would have to do is choose the prompt to send together with the function.

> As far as I understand, your suggestion is a great fit for people who create each voice assistant for only one purpose (registering one function for each voice assistant).

No, it's not about selecting the Home Assistant conversation agent. My proposal is to change the way this component's prompt template is sent: create multiple prompts and attach one based on the function.

This is the prompt → create more, as needed. *(screenshot)*

These are the functions → attach the appropriate prompt to each function. *(screenshot)*

About the model: make the selection per function. As I wrote, with some custom functions I encountered errors with gpt-4 but not with gpt-3.5, so being able to select the model per function would be useful. *(screenshot)*

To recap: the function (spec) determines the prompt and the model (gpt-4, gpt-3.5, etc.).
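As a sketch, that resolution could look like this (the `prompt` and `model` keys on a spec are hypothetical, not part of the current component):

```python
def resolve_prompt_and_model(
    spec: dict, prompts: dict[str, str], default_prompt: str, default_model: str
) -> tuple[str, str]:
    """Resolve the prompt text and model name attached to a function spec."""
    prompt = prompts.get(spec.get("prompt", ""), default_prompt)
    model = spec.get("model", default_model)  # e.g. "gpt-3.5-turbo" instead of "gpt-4"
    return prompt, model
```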

pajeronda commented 4 months ago

It is likely that the modification I have proposed will appear to you as a distortion of the project and, therefore, difficult to implement. But you could make a much simpler change: include, in the instructions sent via the function spec, an additional parameter that enables sending the exposed devices, such as exposed: true/false, and on the configuration page add another box where you can insert the template code, which would then only be added when exposed: true. As the default you could set true.

example in strings.json:

   "options": {
    "step": {
      "init": {
        "data": {
          "prompt": "Prompt Template",
          "expose": "Exposed Entities",  <--- this 
          "model": "Completion Model",
          "max_tokens": "Maximum tokens to return in response",
          "temperature": "Temperature",
          "top_p": "Top P",
          "max_function_calls_per_conversation": "Maximum function calls per conversation",
          "functions": "Functions",
          "attach_username": "Attach Username to Message",
          "use_tools": "Use Tools",
          "context_threshold": "Context Threshold",
          "context_truncate_strategy": "Context truncation strategy when exceeded threshold"
        }
      }
    }
  },

On the configuration page, the template would go in the new Expose text field:

```jinja2
Available Devices:
entity_id,name,state,aliases
{% for entity in exposed_entities -%}
{{ entity.entity_id }},{{ entity.name }},{{ entity.state }},{{entity.aliases | join('/')}}
{% endfor -%}
```

example in functions:

 - spec:
    name: execute_services
    description: Use this function to execute service of devices in Home Assistant.
    exposed: true <--- this 
    parameters:
      type: object
      properties:
        list:
          type: array
          items:
            type: object
            properties:
              domain: ...

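A sketch of how the flag could be honored when building the system message, following the shape of the existing _generate_system_message. The expose-template option, the exposed key, and treating the registered functions as an already-parsed list are all assumptions of this proposal, not current behavior:

```python
def _generate_system_message(self, exposed_entities, user_input):
    raw_prompt = self.entry.options.get(CONF_PROMPT, DEFAULT_PROMPT)
    # Hypothetical second template box holding the "Available Devices" Jinja block.
    expose_template = self.entry.options.get("expose", "")
    functions = self.entry.options.get("functions", [])
    # Append the exposed-entities block only if at least one registered function
    # asks for it; "exposed" defaults to true, as proposed above.
    if any(f.get("spec", {}).get("exposed", True) for f in functions):
        raw_prompt = raw_prompt + "\n\n" + expose_template
    prompt = self._async_generate_prompt(raw_prompt, exposed_entities, user_input)
    return {"role": "system", "content": prompt}
```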
I would like to do it myself but I don't have the Python skills to make it happen. Thank you

jekalmin commented 4 months ago

Thanks for your explanation!

> If instead you could choose the prompt from some that you have previously set, all you would have to do is choose the prompt to send together with the function.

Since you can register a single prompt but multiple functions, there would need to be a way to merge them into one prompt.

> It is likely that the modification I have proposed will appear to you as a distortion of the project and, therefore, difficult to implement

Suggestions are welcome. I really appreciate it! If necessary, it's totally fine to distort the code.

> But you could make a much simpler change: include, in the instructions sent via the function spec, an additional parameter that enables sending the exposed devices, such as exposed: true/false, and on the configuration page add another box where you can insert the template code, which would then only be added when exposed: true. As the default you could set true.

I understand that the exposed option (and model selection) would make it convenient to control whether entities are included in or excluded from the prompt.

However, if another template box (for the exposed parameter) is created, there needs to be a way to merge the two template boxes (the current one and the new exposed: true one) into a single prompt.

I think it could also be confusing for users if this integration does many things under the hood.

Currently, this project is focusing more on functionality than on usability. That focus may change once the project matures.