valentinfrlch / ha-llmvision

Let Home Assistant see!
Apache License 2.0

Assist Intents #93

Open JosephAbbey opened 5 days ago

JosephAbbey commented 5 days ago

The Extended OpenAI Conversation integration works really well, but the true Home Assistant way is to use an Intent. This allows any conversation agent (Assist or any AI) to use the features.

[!WARNING] Creating custom intents in a custom_component is still slightly annoying, as users have to manually copy the custom_sentences file (which defines how the default conversation agent recognises the intents).

This does not apply to AI conversation agents.

[!TIP] I have some example custom intents: JosephAbbey/ha_custom_sentences

Ask about events

The Intent

This is easy: it just takes a start and end time as input and returns the calendar events. In fact, I have an intent in my examples that does just this; however, it is optimised for future events and relative time, so a dedicated tool would be very useful.
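As a rough illustration of the "start and end time in, events out" idea, here is a minimal sketch in plain Python. It deliberately does not use the Home Assistant intent API; the `Event` type and the `events_between` helper are hypothetical names made up for this example and are not part of LLM Vision.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical stored event: a summary plus the time span it covers."""

    summary: str
    start: datetime
    end: datetime


def events_between(events: list[Event], start: datetime, end: datetime) -> list[Event]:
    """Return the events that overlap the window [start, end), oldest first."""
    # An event overlaps the window if it starts before the window ends
    # and ends after the window starts.
    hits = [e for e in events if e.start < end and e.end > start]
    return sorted(hits, key=lambda e: e.start)
```

An intent handler built on this would only need to parse the two timestamps from its slots and format the returned summaries into a response.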

The Sentences

This would have to be an AI-only intent, as there is no standard format for queries.

Ask about the current state

The Intent

The general format of the intent is quite straightforward:

However, the specifics are interesting:

The best way I can see of allowing a user to specify a provider in the request is to create some sort of vision.* entity, which stores the provider, model, and all of the configuration. The intent then accepts the vision entity as input (this also benefits the YAML mode for service calls, as an entity ID can be specified instead of a provider ID). That, however, would be a large breaking change and a restructure of the project.

The other option is just to have global configuration options for the intent.
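To make the two options concrete, here is a minimal sketch of how they could coexist: an optional vision entity that carries provider, model, and configuration, with a fallback to a global default. Everything here is an assumption for illustration; `VisionConfig`, `resolve_config`, and the `vision.doorbell` entity ID are hypothetical and do not exist in LLM Vision or Home Assistant.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class VisionConfig:
    """Hypothetical settings a vision.* entity might store."""

    provider: str
    model: str
    options: dict[str, Any] = field(default_factory=dict)


def resolve_config(
    entities: dict[str, VisionConfig],
    global_default: VisionConfig,
    entity_id: Optional[str] = None,
) -> VisionConfig:
    """Pick the config named in the request, else the global default."""
    if entity_id is None:
        # Option 2: no entity given, so use the global intent configuration.
        return global_default
    try:
        # Option 1: the request names a vision entity explicitly.
        return entities[entity_id]
    except KeyError:
        raise ValueError(f"Unknown vision entity: {entity_id}") from None
```

With this shape, the global-configuration option is just the case where the request carries no vision entity, so the entity-based approach could be introduced later without changing how the intent is called.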

The Sentences

I think that the general format of the sentence will be:

Who is on the door bell camera?

Here we match the phrase "on the" and the camera entity "door bell camera". Then either the whole sentence is used as the prompt, or just the "Who is" part; I prefer the former.
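The matching described above can be sketched with a plain regular expression. This is only a stand-in to show the idea: the built-in agent matches sentences through the custom_sentences files, not through code like this. The `parse_request` function, the pattern, and the `camera.door_bell` entity ID are hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern: a question, the literal phrase "on the",
# then a camera name, with an optional trailing question mark.
_PATTERN = re.compile(
    r"^(?P<question>.+?)\s+on the\s+(?P<camera>.+?)\s*\??$",
    re.IGNORECASE,
)


def parse_request(sentence: str, cameras: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (camera_entity_id, prompt), or None if nothing matches.

    `cameras` maps lower-case friendly names to entity IDs.
    """
    text = sentence.strip()
    match = _PATTERN.match(text)
    if match is None:
        return None
    entity_id = cameras.get(match.group("camera").lower())
    if entity_id is None:
        return None
    # Use the whole sentence as the prompt (the preferred option).
    return entity_id, text
```

For example, with `{"door bell camera": "camera.door_bell"}` as the camera map, "Who is on the door bell camera?" resolves to that camera with the full sentence as the prompt, while "Who is at the door?" returns `None` because it contains neither "on the" nor a known camera name.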

I don't have a good way for built-in intent recognition to process a sentence like:

Who is at the door?

This is easy with an AI agent as the AI can recognise that the user has a door bell camera and call the intent as required.

Happy to help

I have written a bunch of intents before and am happy to help implement this if you would like to add this to your project.

valentinfrlch commented 4 days ago

Thanks a lot, this sounds like an amazing addition! I don't have any experience writing intents for HA Assist and have limited time atm. But if you'd like to send a PR, I'd be happy to review it and help if you have any questions about LLM Vision.

Looking forward to collaborating!