phatneglo closed this issue 4 months ago.
Hi @phatneglo.
Thank you for the insightful description of your issue.
Unfortunately, textToSpeech does not work for HTML responses, as there is no standardized way to determine which part of the markup is the actual text that needs to be spoken. To get this to work for your case, you will have to fork/clone the project and tailor the codebase to your custom HTML responses. It is actually not that hard to do, and the instructions for getting started are listed here.
Another small thing I noticed is that you are using the addMessage method, which has been deprecated. I would instead advise you to set the websocket property to true, which will allow you to reply with multiple messages (you do not need an actual websocket connection). In addition to this, our recent dev version actually supports using websockets as streams, which introduces a stop word on the stream property to indicate that a message has finished streaming. This would work perfectly with your setup; just a few small things will need to be switched around for the websocket infrastructure (check the websocket example in the handler documentation). Find out how to use this here.
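Based on my reading of the handler documentation, a websocket-style handler might look roughly like the sketch below. The `connect` property name and the exact shape of the `signals` object are assumptions on my part; check the documentation for your version:

```javascript
// Sketch of a websocket-style Deep Chat handler (shape assumed from the
// handler documentation; verify property/signal names for your version).
// With websocket: true, the handler can emit several responses per request.
const connect = {
  websocket: true,
  handler: (_body, signals) => {
    signals.onOpen(); // tell the chat the "connection" is ready
    signals.newUserMessage.listener = (body) => {
      // Reply with multiple messages for a single user message.
      signals.onResponse({ text: 'First part of the reply.' });
      signals.onResponse({ html: '<button>Follow-up question</button>' });
    };
  },
};
```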
Just wanted to add that I am very much on the fence about bringing back the addMessage method, given how many people seem to want to use it, so it may return in the near future. Currently, you can actually still use it by simply calling the _addMessage method.
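For completeness, calling the underscored method could look something like the sketch below. The wrapper function and the `role: 'ai'` field are my own assumptions about the message shape; `chatEl` stands for a reference to the `<deep-chat>` element:

```javascript
// Hypothetical usage: push an assistant message via the underscored method.
// The message shape ({ text, role }) is assumed - verify against the docs.
function addAssistantMessage(chatEl, text) {
  chatEl._addMessage({ text, role: 'ai' });
}
```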
@OvidijusParsiunas yeah, I noticed that too. When I updated the package, I saw that addMessage was gone and found it as _addMessage.
Also, another issue:
If you notice that the TTS voice keeps speaking for a long time, especially with lengthy responses, you can adjust the behavior with the "stop" or "mute" commands. When the TTS voice is active, simply say "stop" or "mute" to stop it or pause it temporarily. This lets you proceed to the next interaction without the TTS voice continuing to speak. If you encounter any issues with this functionality, please let me know so I can make sure it is properly addressed.
Again, thanks for your support! I wish I could help you back once I'm done with my project.
Hey, adding text-to-speech configuration for options such as start/stop/resume is a little tricky for a couple of reasons: first, it would be a significant UX change requiring a different message layout to make room for these options (this is one of the reasons the ChatGPT app has an entirely different chat experience when using STT and TTS); second, something like this would take a considerably long time to develop.
Therefore, because this functionality has lower priority than our other upcoming work, I unfortunately will not be able to pursue it right now, but perhaps I can revisit it in the future.
Thank you for the suggestion!
No problem, bro! Again, thanks for your help. I'll close this now!
Hey there!
I've been playing around with deep-chat for a bit and really love what you've built. It's been super helpful for adding chat functionality to my Vue projects. 🚀
While integrating it, I thought it'd be awesome to have a bit more control over the speech synthesis, especially being able to mute or stop the assistant's speech on the fly. I figured this could really amp up the user experience, letting users quickly pause the assistant whenever needed.
So, I tinkered around and came up with a neat little enhancement that does just that. Here's the gist of what I did:
1. Keeping an eye on speech: I used Vue's `ref` to set up a reactive property called `isSpeaking` that keeps track of whether the speech synthesis is doing its thing.
2. Getting in the middle: I wrapped the `window.speechSynthesis.speak` method so I could listen in on when speech starts and stops, updating `isSpeaking` to reflect the current state.
3. Quick mute button: I popped in a Quasar Floating Action Button (FAB) that shows up only when the assistant is chatting away. A quick tap on this button, and voilà, the assistant takes a breather.
Here's how I pieced it together:
```js
import { ref, onMounted } from "vue";

const isSpeaking = ref(false);

const cancelSpeech = () => {
  window.speechSynthesis.cancel();
  isSpeaking.value = false; // Let's keep things updated
};

onMounted(() => {
  // Wrap the native speak() so we can observe start/end events.
  const originalSpeak = window.speechSynthesis.speak.bind(window.speechSynthesis);
  window.speechSynthesis.speak = (utterance) => {
    utterance.addEventListener("start", () => {
      isSpeaking.value = true;
      console.log("We're talking!");
    });
    utterance.addEventListener("end", () => {
      isSpeaking.value = false;
      console.log("All done talking.");
    });
    originalSpeak(utterance);
  };
});
```
And in the template:
```html
<q-btn v-if="isSpeaking" fab icon="fas fa-volume-mute" color="teal" @click="cancelSpeech" class="fixed-bottom-right" />
```
Plus a little styling to keep the button snug in the corner:
```css
.fixed-bottom-right {
  position: fixed;
  right: 20px;
  bottom: 20px;
  z-index: 999;
}
```
I thought this might be a handy feature for deep-chat! It's been a game-changer for me, and I reckon it could be for others too. Maybe it's something that could be baked into the next version? Just a thought! 😄
Keep up the great work!
Hi @phatneglo.
By default, Deep Chat uses the speechSynthesis window property to facilitate the text-to-speech functionality, so the way you tapped into it from Vue is very clever!
As I mentioned, one of the bigger hurdles to facilitating this functionality from within Deep Chat is the UX, as slotting it into the chat in a clean manner is not simple. Of course, there is also the development time that comes along with it. For now, I hope developers can look at your code and build extensible text-to-speech experiences that way.
Thank you very much once again @phatneglo!
Huge shoutout to the folks behind this repo! 🙌 Seriously, you guys have saved me more times than I can count. Everything from the docs to the code is just top-notch and super easy to dive into. It's been a game-changer for my projects, and I've learned a ton along the way.
Big thanks for all the hard work you've put in. You've made something awesome that's not just helpful but also super inspiring.
Environment:
Issue Description: In our Vue 3 chat application, we use streaming to implement text-to-speech (TTS) functionality for chat messages. The TTS works as expected for initial messages. However, when we try to introduce follow-up questions by uncommenting the logic that adds extra HTML content for user selection, the TTS stops working. It seems the TTS functionality fails to read the reply once new HTML content is introduced into the chat.
Steps to Reproduce:
Expected Behavior: The text-to-speech should continue to work and read out the chat messages even after adding follow-up questions with HTML content.
Actual Behavior: After adding follow-up questions with HTML content to the chat, the text-to-speech functionality stops working. It appears that the TTS can only read the initial reply from the stream, and fails to process the newly added messages.
Troubleshooting Done: