Function spec example for say_tts

jekalmin / extended_openai_conversation

Home Assistant custom component of conversation agent. It uses OpenAI to control your devices.

Function spec example for say_tts #176

Closed: danielp370 closed this 3 months ago

danielp370 commented 3 months ago

A flexible say_tts example. It lets the LLM pass in any TTS entity, so it can direct the message to wherever it is needed.
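As a rough illustration of what such a function spec could look like (a sketch, not necessarily the spec that was originally posted): it assumes the `script` function type shown in the project's README and Home Assistant's `tts.speak` service. The parameter names `tts_entity`, `media_player`, and `message` are hypothetical and chosen only for this example.

```yaml
- spec:
    name: say_tts
    description: Speak a message on a media player using a given TTS entity
    parameters:
      type: object
      properties:
        tts_entity:            # hypothetical parameter name
          type: string
          description: Entity ID of the TTS engine to use, e.g. tts.some_engine
        media_player:          # hypothetical parameter name
          type: string
          description: Entity ID of the media player that should speak
        message:               # hypothetical parameter name
          type: string
          description: The text to speak
      required:
      - tts_entity
      - media_player
      - message
  function:
    type: script
    sequence:
    # tts.speak targets the TTS entity and takes the speaker as data
    - service: tts.speak
      target:
        entity_id: '{{ tts_entity }}'
      data:
        media_player_entity_id: '{{ media_player }}'
        message: '{{ message }}'
```

Because the TTS entity and the media player are both parameters, the model decides where the message goes; the entities it may choose from still have to be exposed to it.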

Scags104 commented 3 months ago

Question on this: I know it's not the intended purpose, but could this be modified so that my input device and output device are separate entities?

Example:
- Ask a Home Assistant satellite with no external speaker a question.
- The Home Assistant pipeline, using Extended OpenAI Conversation, processes it.
- A function is always called to send the response to a specific speaker, based on the area of the input satellite.

Let me know your thoughts.

TGrounds commented 3 months ago

I'm looking for a similar solution.

jekalmin commented 3 months ago

If this PR is merged, you will be able to use the input device_id in a service template. That way, it seems you could conditionally select an output speaker.
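A minimal sketch of how that conditional selection might look, assuming the PR exposes the input device's ID to templates as a variable named `device_id` (the thread does not confirm this). It uses Home Assistant's `area_id` and `area_entities` template functions to pick the first media player in the same area as the input device. The function name `say_in_same_area` and the entity `tts.example_engine` are hypothetical.

```yaml
- spec:
    name: say_in_same_area     # hypothetical function name
    description: Speak a message on a media player in the area of the device that received the request
    parameters:
      type: object
      properties:
        message:
          type: string
          description: The text to speak
      required:
      - message
  function:
    type: script
    sequence:
    - service: tts.speak
      target:
        entity_id: tts.example_engine   # hypothetical TTS entity
      data:
        # device_id is assumed to be provided by the PR mentioned above.
        # Picks the first media_player entity found in the input device's area.
        media_player_entity_id: "{{ area_entities(area_id(device_id)) | select('match', 'media_player[.]') | first }}"
        message: '{{ message }}'
```

If the area contains no media player, the template has nothing to select and the service call will fail, so a real setup would likely want a fallback speaker.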

Scags104 commented 3 months ago

Sure, but how would I grab the response that OpenAI sends back?

jekalmin commented 3 months ago

Currently, as far as I know, the Assist pipeline doesn't support responding through a different speaker. Although I haven't tried it, this would only be a workaround: the function call pretends to be the output speaker, but it isn't in reality. Hence, the response will be lost.

rfam13 commented 3 months ago

Check out the Stream Assist integration; I am using it with this. It creates a pipeline from any camera entity or RTSP stream and lets you select the output speaker. That is the basic function. However, I wanted my Alexa to be the output, and that does not work with the TTS service that Stream Assist calls on the output device. So I had to install the Alexa Media Player add-on and use Alexa's custom command with "Simon says". For this to work, I have an automation that is triggered when the pipeline (Stream Assist) reports that text has been detected by the STT entity. When that is true, it parses the text from the TTS entity with templating and sends it to whatever device you want, in my case Alexa. That sounds more confusing than it is, lol.