Open RASPIAUDIO opened 8 months ago
Thanks for the suggestion!
In order to achieve streaming responses, everything from end to end has to support streaming. Although OpenAI supports the stream option, as far as I know, HA's IntentResponse would need to support streaming as well, which it currently does not.
Please correct me if anything should be fixed.
Perhaps one way to do it without any major change would be to split the response into sentences (detecting a '.' followed by a capital letter); each sentence could then be sent one after the other in the IntentResponse. And the prompt should instruct the model to answer with short sentences.
Could it be done on your side?
It could really change the user experience.
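A minimal sketch of the sentence-splitting idea above, assuming plain Python and a simple regex (the function name and example text are just illustrative):

```python
import re

def split_sentences(text: str) -> list[str]:
    # Split on a '.' followed by whitespace and a capital letter,
    # keeping the period attached to the preceding sentence.
    parts = re.split(r"(?<=\.)\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]

print(split_sentences("The light is on. It was turned on at 8 PM. Anything else?"))
# → ['The light is on.', 'It was turned on at 8 PM.', 'Anything else?']
```

Note that this naive rule would also split after abbreviations ending in a period, so it is only a starting point.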
Although it's possible to split the reply into multiple sentences, only one IntentResponse is used per conversation, so I can't send multiple IntentResponses to the next assist pipeline.
I have been looking for a way to speed up the reply times as well and stumbled over this issue. I do not have a solution, but would like to contribute some brainstorming: 1) I do not think the average response time to actually perform an action is problematic, so without knowing the ins and outs of Home Assistant, the actual intent recognition could stay where it is. 2) What feels too slow is the time until the reply starts. Just imagining (not knowing) how Home Assistant is built, I assume Extended OpenAI Conversation could not simply do voice/text output repeatedly before listening again. I also assume that triggering TTS from outside of Wyoming would not yield any result.
So would a request for HomeAssistant to allow for multiple TTS outputs before listening again be helpful?
If you're using GPT-4 and ask the assistant to create a lengthy bedtime story for your children, you might experience a wait time of nearly a minute, because the system waits for the entire story to complete before generating the text-to-speech (TTS) output. While this is a niche use case, implementing a streaming feature could save a few seconds per query. Multiplied by the number of daily questions and the user base, this improvement could thus save many human lives in terms of time saved. :)
Maybe the first step is to request a streaming feature for the voice assistant:
https://community.home-assistant.io/t/streaming-feature-for-voice-assistant/678923
I am encountering the same issue regarding response time in my use case, as TTS waits until all the text has finished. A stream would make the whole experience seamless. @RASPIAUDIO, @bblaha since the time of your reply, any luck?
I am using GPT-4 in HA with the Raspiaudio Luxe speaker, and long answers need 20-30s before being played. Would it be possible to implement the streaming option of the OpenAI API? It is what's missing to compete with the other commercial voice assistants.
But then I guess it would also require a modification of the TTS plugin.
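For reference, a rough sketch of how streamed text could be flushed to TTS sentence by sentence instead of waiting for the full completion. The simulated chunks below stand in for the deltas that OpenAI's stream=True option delivers; the function name is made up:

```python
import re

def flush_sentences(deltas):
    """Accumulate streamed text deltas and yield each complete sentence as
    soon as it appears, so TTS could start before the full reply is done."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        # A sentence is considered complete at '.', '!' or '?'
        # followed by a whitespace character.
        while True:
            m = re.search(r"[.!?]\s", buffer)
            if not m:
                break
            yield buffer[:m.end()].strip()
            buffer = buffer[m.end():]
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream

# Simulated streamed chunks (sentence boundaries may fall mid-chunk):
chunks = ["Once upon a ti", "me there was a fox. It li", "ved in a forest. The end."]
print(list(flush_sentences(chunks)))
# → ['Once upon a time there was a fox.', 'It lived in a forest.', 'The end.']
```

Each yielded sentence could be handed to the TTS engine while the rest of the completion is still streaming in, which is essentially the split-by-sentences idea discussed above applied on the fly.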