capacitor-community / text-to-speech

⚡️ Capacitor plugin for synthesizing speech from text.
MIT License

feat: Add support for word-level progress tracking in TextToSpeech #131

Open Dhruv-1105 opened 2 months ago

Dhruv-1105 commented 2 months ago

Is your feature request related to a problem? Please describe: In many applications that use text-to-speech (TTS), it is essential to track the progress of spoken words to provide features such as synchronized text highlighting. Currently, the @capacitor-community/text-to-speech package does not offer a way to get real-time updates on the specific words being spoken, which limits its utility in such scenarios.

Describe the solution you'd like: I propose adding support for an onRangeStart event that emits the start and end indices of the currently spoken word, along with the spoken word itself. This feature would allow developers to track which word is being spoken in real-time and implement functionalities such as synchronized text highlighting. The implementation involves the following changes:

Describe alternatives you've considered: An alternative approach could be to periodically poll the TTS engine for its current progress, but this would be less efficient and more complex to implement. Integrating directly with the UtteranceProgressListener provides a more reliable and accurate solution.

Additional context: This feature is critical for applications that need to provide synchronized text highlighting, karaoke-style text displays, or any other feature that requires real-time tracking of spoken words. Adding this capability to the @capacitor-community/text-to-speech package will significantly enhance its usability for a broader range of applications.
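To make the highlighting use case concrete, here is a minimal TypeScript sketch of what a consumer might do with such an event. The `RangeStartEvent` shape (`start`, `end`, `spokenWord`) is an assumption taken from the wording of this request, `end` is assumed to be exclusive, and `splitForHighlight` is a hypothetical helper, not part of the plugin.

```typescript
// Assumed event shape, based on the description in this issue:
// start index, end index, and the word being spoken.
interface RangeStartEvent {
  start: number;
  end: number; // assumed to be exclusive
  spokenWord: string;
}

interface HighlightParts {
  before: string;
  current: string;
  after: string;
}

// Hypothetical helper: split the full text into three parts so a UI
// can wrap `current` in a highlight element.
function splitForHighlight(text: string, ev: RangeStartEvent): HighlightParts {
  // Clamp indices so a malformed event cannot slice out of bounds.
  const start = Math.max(0, Math.min(ev.start, text.length));
  const end = Math.max(start, Math.min(ev.end, text.length));
  return {
    before: text.slice(0, start),
    current: text.slice(start, end),
    after: text.slice(end),
  };
}

const sample = "Hello brave new world";
const parts = splitForHighlight(sample, { start: 6, end: 11, spokenWord: "brave" });
console.log(`${parts.before}[${parts.current}]${parts.after}`);
// prints "Hello [brave] new world"
```

In an app, the handler registered for the proposed event would call a helper like this on every update and re-render the three parts.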

Dhruv-1105 commented 2 months ago

Please check the following PR for this issue: https://github.com/capacitor-community/text-to-speech/pull/132

bridgecode commented 1 day ago

Hello, I'm pretty new to Capacitor and not sure if this is the correct place to ask, but I tried to implement this feature in my Vue/Vite/Ionic/Capacitor app and I'm having trouble getting it to work. I'm not sure whether it only works on a device; I was trying it in Chrome so I could debug and get it working across all applications with this version. I would prefer not to use the Web SpeechSynthesisUtterance API in parallel, to avoid different behavior between web and mobile. Is it possible to get it working in my local web environment, or will it only work on a device? I can use an emulator, but I'd prefer to have it working on the web as well.

Thanks in advance for any info, or for a CodePen example I can look at with a console log of the start/end/frame. Also, thanks for the hard work getting this feature out; I'm really excited for this! (Please let me know if I should put this in a separate issue.)
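For debugging in a browser, one possible approach is to normalize the Web Speech API's `boundary` event into the same start/end/word shape. This is only a sketch: this thread does not say whether the plugin's web implementation emits the event, `rangeFromBoundary` and `WordRange` are hypothetical names, and `charLength` support and boundary-event behavior vary by browser and voice.

```typescript
// Minimal structural type covering the fields of a browser
// SpeechSynthesisEvent that this sketch reads.
interface BoundaryLike {
  charIndex: number;
  charLength?: number; // not reported by every browser
}

// Hypothetical normalized shape, mirroring the event described in this issue.
interface WordRange {
  start: number;
  end: number; // exclusive
  spokenWord: string;
}

function rangeFromBoundary(text: string, b: BoundaryLike): WordRange {
  const start = Math.max(0, Math.min(b.charIndex, text.length));
  let end: number;
  if (typeof b.charLength === "number" && b.charLength > 0) {
    // Use the reported length when the browser provides one.
    end = Math.min(start + b.charLength, text.length);
  } else {
    // Fallback: read up to the next whitespace character.
    const offset = text.slice(start).search(/\s/);
    end = offset === -1 ? text.length : start + offset;
  }
  return { start, end, spokenWord: text.slice(start, end) };
}

// Browser wiring (not executed here), assuming `text` holds the spoken string:
//   const utterance = new SpeechSynthesisUtterance(text);
//   utterance.onboundary = (e) => console.log(rangeFromBoundary(text, e));
//   speechSynthesis.speak(utterance);

const demo = rangeFromBoundary("Hello brave new world", { charIndex: 6 });
console.log(demo.start, demo.end, demo.spokenWord);
// prints "6 11 brave"
```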
