jekalmin / extended_openai_conversation

Home Assistant custom component of conversation agent. It uses OpenAI to control your devices.

[Question] Would it be possible to integrate an option to use OpenAI's new Assistants and Threads API? #30

Open cl0ud6uru opened 9 months ago

cl0ud6uru commented 9 months ago

An option to integrate OpenAI's new Assistants and Threads API would be handy as a way to get it to remember past conversations for folks using the wake word function of Assist in Home Assistant. As of now, when using wake words to communicate, it does not retain the past conversation like the Assist Chat does (well, the current conversation, I should say, with Assist Chat). Maybe add a user-adjustable setting for how long to retain threads (day, week, etc.). This would probably require quite the overhaul, so it's more of a general question at the moment.
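To make the retention idea concrete, here is a minimal sketch of a per-device thread store with an adjustable retention window. Everything in it is an assumption for illustration: `ThreadStore`, `get_or_create`, and the `create_thread` callback are hypothetical names, not part of this component or of OpenAI's client. Retention is measured from the last use of a thread, which is only one possible reading of "how long to retain".

```python
import time


class ThreadStore:
    """Maps a device id to a conversation thread id, with a retention window."""

    def __init__(self, retention_seconds, clock=time.time):
        self._retention = retention_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._threads = {}   # device_id -> (thread_id, last_used_timestamp)

    def get_or_create(self, device_id, create_thread):
        """Return the live thread for device_id, or create a fresh one.

        create_thread is a hypothetical callback that would ask the remote
        API for a new thread and return its id.
        """
        now = self._clock()
        entry = self._threads.get(device_id)
        if entry is not None and now - entry[1] < self._retention:
            thread_id = entry[0]           # still inside the retention window
        else:
            thread_id = create_thread()    # expired or never seen: start over
        self._threads[device_id] = (thread_id, now)
        return thread_id
```

With `retention_seconds=86400` a satellite would keep its context for a day after the last exchange, and a week would be `7 * 86400`.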

jekalmin commented 9 months ago

Although I haven't used the new Assistants and Threads API yet, it seems possible as long as OpenAI supports adding messages to a Thread and function calling.

If the new API reduced the token count, I would go for it without a doubt. However, to the best of my knowledge, the token count is not reduced, even though we no longer have to keep sending the message history to the API.

So I think I would focus more on making the current plugin stable rather than trying out the new API at this moment.

enzo2 commented 8 months ago

I am also wondering about the Assistants/Threads API. I would not be surprised if Nabu Casa makes it a paid feature in Home Assistant Cloud. An AI assistant that really gets to know the household--a classic science fiction idea becoming real.

As a further suggestion (but potentially expensive), I wonder if the 'retrieval' tool (https://platform.openai.com/docs/assistants/tools/knowledge-retrieval) could be used. For example, generate a text file of all entities, devices, services, scripts and automations, and upload it to the Thread. Delete and re-upload when anything changes. (According to the docs, "We plan to introduce other retrieval strategies to enable developers to choose a different tradeoff between retrieval quality and model usage cost.")
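A rough sketch of the "re-upload only when anything changes" part, assuming entities are available as plain dicts. The field names (`entity_id`, `name`, `area`, `aliases`) and the helper names are assumptions for illustration, and the actual upload call is left out on purpose. The snapshot is sorted so that the same set of entities always yields the same text, and a hash of that text decides whether a new upload is needed.

```python
import hashlib


def build_snapshot(entities):
    """Render a stable, sorted text snapshot of entity metadata."""
    lines = []
    for e in sorted(entities, key=lambda e: e["entity_id"]):
        aliases = ", ".join(e.get("aliases", []))
        lines.append(
            f'{e["entity_id"]} | {e["name"]} | {e.get("area", "")} | {aliases}'
        )
    return "\n".join(lines) + "\n"


def snapshot_digest(text):
    """Fingerprint of the snapshot, stored next to the uploaded file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_upload(text, last_digest):
    """True when the snapshot differs from the one uploaded last time."""
    return snapshot_digest(text) != last_digest
```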

jekalmin commented 8 months ago

Oh, I hadn't thought of uploading all entities, devices, services, scripts, and automations. In the ideal case, it would solve some of the problems we currently have, such as calling a function that doesn't exist. Although it might take a while, I will look into it.

richardsorensson commented 8 months ago

The idea of using the Assistant API is exciting.

A thought on the topic: as entities change state a lot, it may not be efficient to upload a list over and over.

But uploading information about Home Assistant and docs related to the API may be a good starting point to teach the assistant various things about Home Assistant.

However, I do believe function calling would be a good approach to fetch entities and their current states as needed, instead of working with a static list. Of course, there has to be an endpoint available in Home Assistant to fetch entities and states for this to work.
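As a sketch of what fetching states on demand could look like: a function definition the model can call, plus a handler that answers it. The function name `get_entity_states` and the in-memory `states` dict (standing in for whatever endpoint Home Assistant would expose) are assumptions for illustration. The definition follows the JSON-schema style used for function calling, but treat the exact envelope as an assumption too.

```python
import json

# Hypothetical function definition offered to the model.
GET_STATES_TOOL = {
    "type": "function",
    "function": {
        "name": "get_entity_states",
        "description": "Return the current state of the given entities.",
        "parameters": {
            "type": "object",
            "properties": {
                "entity_ids": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["entity_ids"],
        },
    },
}


def handle_tool_call(name, arguments_json, states):
    """Answer one function call; the result is sent back as a JSON string.

    states is a plain dict of entity_id -> state, standing in for a real
    lookup against Home Assistant.
    """
    if name != "get_entity_states":
        return json.dumps({"error": f"unknown function: {name}"})
    args = json.loads(arguments_json)
    # Entities that are not found are reported as "unknown" rather than dropped.
    result = {eid: states.get(eid, "unknown") for eid in args["entity_ids"]}
    return json.dumps(result)
```

Only the entities the model asks about travel over the API, at the price of one extra round trip per lookup.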

saya6k commented 8 months ago

I desperately want to have the Threads API as an option. It would seriously improve how we communicate with the Assistant regardless of token price if #17 is merged. It could simply be an option, for anyone who does care about tokens.

that1guy commented 8 months ago

100% agree. When you use Assist's chat UI it feels magical because OpenAI can learn and remember previous conversation and therefore has context. That magic is lost when interacting with OpenAI via something like Wyoming Satellite.

I'd highly recommend building this feature, and if it is expensive then put it behind a feature flag of some sort.

Danm72 commented 7 months ago

As another direction for this, I've been thinking about running the assistant based on events/triggers and on a schedule; this would allow it to become more autonomous.

Right now it's user-triggered and has a job to do. If it ran regularly and had context of the system and preferred operating parameters, it could be proactive rather than reactive.

For example, if it ran every 10-15 mins and was responsible for checking:

cl0ud6uru commented 7 months ago

As another direction for this, I've been thinking about running the assistant based on events/triggers and on a schedule; this would allow it to become more autonomous.

Right now it's user-triggered and has a job to do. If it ran regularly and had context of the system and preferred operating parameters, it could be proactive rather than reactive.

For example, if it ran every 10-15 mins and was responsible for checking:

  • Are all the lights in expected states

  • Are temperatures correct

  • Are all automations behaving as described, or can it suggest improvements - 'this room has met temperature but is still heating, maybe set temperature a degree lower'

  • Alerts and warnings - Water leak detection

  • User returned home behaviors

Assist can be called via automations and will use whatever pipeline you have configured. So if your pipeline includes Extended OpenAI Conversation, then any time you call Assist it will use it. I have a few Node-RED flows that give it input and use the output however needed. Just wanted to throw that out there. Also, I wanted to point out that the cost of making OpenAI API calls every 15 minutes would definitely rack up.
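To put a rough number on that, here is a back-of-the-envelope estimator. The interval comes from the discussion above. The tokens per call and the price per 1,000 tokens are placeholders to fill in from your own usage and the current price list, not real figures.

```python
def calls_per_day(interval_minutes):
    """Number of scheduled runs in 24 hours at a fixed interval."""
    return (24 * 60) // interval_minutes


def daily_cost(interval_minutes, tokens_per_call, price_per_1k_tokens):
    """Estimated spend per day; both token count and price are user-supplied."""
    calls = calls_per_day(interval_minutes)
    return calls * tokens_per_call / 1000 * price_per_1k_tokens
```

Running every 15 minutes means 96 calls a day and every 10 minutes means 144, before a single user-triggered request is counted.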

Floris3 commented 7 months ago

Love this project!

I second that using OpenAI Assistants seems like the way forward. If information about entities is already known by the assistant, there would be no need to send the entire smart home info over the API with every user request. The assistant could be set up to call a function that looks up the state of an entity and sends it over the API in the same thread, and only then formulate an answer to the user. That way only the information relevant to the user's question is sent, thus making it more cost-efficient. Would it not?

MindFreeze commented 7 months ago

I like sending the state each time, because lookups are slow and unreliable. Also, multiple entity states may be needed to determine what to do. That said, static data like entity name, area and aliases definitely doesn't need to be sent every time.
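A small sketch of that split, assuming entities are plain dicts: the static part would be sent once (for example when a thread is created), and only a compact state listing would go with each request. The field names and both helper names are assumptions for illustration.

```python
# Fields assumed to change rarely; everything else is treated as dynamic.
STATIC_FIELDS = ("entity_id", "name", "area", "aliases")


def static_context(entities):
    """Metadata that only needs to be sent once per thread."""
    return [{k: e[k] for k in STATIC_FIELDS if k in e} for e in entities]


def state_context(entities):
    """Compact 'entity_id,state' lines to send with every request."""
    return "\n".join(f'{e["entity_id"]},{e["state"]}' for e in entities)
```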