AlexxIT / StreamAssist

Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant
MIT License

Continued Conversation with Context? #31

Open vash2695 opened 1 week ago

vash2695 commented 1 week ago

Loving this integration so far! I'm not sure if this is the right link in my particular chain, but I'm looking for some way of extending some of the capabilities of the voice assistant functionality and this seems like a good place to start!

For starters, here's what I'm using in my pipeline: openWakeWord > HASS Cloud STT > OpenAI Extended > ElevenLabs TTS

This setup is fast(ish) and smart, but losing the thread on each wake is highly inefficient and limits functionality in many ways. Since we know that the chat interface can maintain a conversation within a single session, I'd like to know if it's possible to do the same with voice interaction. I think this would involve two parts:

  1. Add the option to bypass the wake word on subsequent interactions after an initial activation
     a. Stops if no text is detected or specific phrases like "Thank you" or "Nevermind" are used
  2. After interaction has stopped, the conversation thread should be kept open for a certain amount of time to maintain continuity and avoid the need to resend entity information if another activation takes place within that window
     a. Doesn't need to be long, maybe a minute or so after the last interaction ends
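
To make the two parts a bit more concrete, here's a rough sketch of the session logic I have in mind. This is not StreamAssist code and doesn't use any of its APIs: `ConversationSession`, `on_wake_word`, `on_transcript`, `needs_wake_word`, and the stop-phrase list are all made-up names for illustration, and it assumes the conversation agent can continue a thread when it's handed the same conversation ID again.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical phrases that end the follow-up loop (part 1a).
STOP_PHRASES = {"thank you", "nevermind", "never mind"}


@dataclass
class ConversationSession:
    """Hypothetical helper; not part of StreamAssist or Home Assistant."""

    keep_alive: float = 60.0                      # seconds the thread stays open (part 2a)
    clock: Callable[[], float] = time.monotonic   # injectable so it can be tested
    conversation_id: Optional[str] = None
    last_activity: float = 0.0
    listening: bool = False                       # True while follow-ups skip the wake word

    def _expired(self) -> bool:
        if self.conversation_id is None:
            return True
        return self.clock() - self.last_activity > self.keep_alive

    def on_wake_word(self) -> str:
        """Start listening; reuse the thread if it is still inside the window."""
        if self._expired():
            self.conversation_id = uuid.uuid4().hex
        self.listening = True
        self.last_activity = self.clock()
        return self.conversation_id

    def on_transcript(self, text: str) -> bool:
        """Return True if the text should be sent on to the conversation agent."""
        normalized = text.strip().lower().rstrip(".!?")
        self.last_activity = self.clock()
        if not normalized or normalized in STOP_PHRASES:
            # Empty STT result or a stop phrase: require the wake word again,
            # but keep conversation_id so the thread can still be resumed.
            self.listening = False
            return False
        return True

    def needs_wake_word(self) -> bool:
        return not self.listening
```

The idea would be that the pipeline checks `needs_wake_word()` before each listening round, and passes the ID returned by `on_wake_word()` along with every request so the agent sees one continuous thread until the keep-alive window runs out.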

So, is there a way to implement this within StreamAssist or would this functionality need to be configured elsewhere? I'm by no means an experienced developer but I may know enough to help with figuring some of this out!