ken107 / read-aloud

An awesome browser extension that reads aloud webpage content with one click
https://readaloud.app
MIT License
1.31k stars 227 forks

Support the ability to merge texts into a single chunk #376

Closed guitarino closed 3 months ago

guitarino commented 4 months ago

Hi @ken107! I highly appreciate your work on this plugin!

Some TTS engines, especially neural-network-based ones such as Piper TTS, don't work well when text is split into many chunks: there is a noticeable delay before each chunk begins speaking. The preferred way to use them is therefore often to submit multiple sentences and paragraphs at once as a single chunk, so that while the first sentence is being spoken, the engine can generate the following sentences and, once they are ready, append them to the speech stream.

Given that this is likely the direction text-to-speech is heading, and because I'd really like this feature myself, I'd like to add support for an option to merge texts into a single chunk.

The current PR adds a dropdown named "Merge or split text", saved per voice, with the options to either

Here are the screenshots: [Screenshot from 2024-02-11 00-35-19] [Screenshot from 2024-02-11 00-35-35]

Now, there is an alternative and possibly better solution to the problem: we could have an input field for customizing how many paragraphs can go into a single chunk, along with a way to make it unlimited (e.g. if "-1" is specified). I'm totally willing to implement that if you think that's better!
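To make the alternative concrete, here is a rough sketch of how a paragraphs-per-chunk setting could behave. It is not taken from the read-aloud codebase: the function name `mergeParagraphs`, the `maxParagraphs` parameter, and the choice to join paragraphs with a blank line are all assumptions made for illustration.

```typescript
// Hypothetical helper, NOT part of the read-aloud codebase.
// Groups paragraphs into chunks of at most `maxParagraphs` paragraphs.
// A value of -1 is treated as "unlimited": all text goes into one chunk.
function mergeParagraphs(paragraphs: string[], maxParagraphs: number): string[] {
  const valid =
    Number.isInteger(maxParagraphs) && (maxParagraphs === -1 || maxParagraphs > 0);
  if (!valid) {
    throw new RangeError("maxParagraphs must be -1 or a positive integer");
  }

  // Drop empty paragraphs so they don't count toward the limit.
  const nonEmpty = paragraphs.map((p) => p.trim()).filter((p) => p.length > 0);
  if (nonEmpty.length === 0) {
    return [];
  }

  const size = maxParagraphs === -1 ? nonEmpty.length : maxParagraphs;
  const chunks: string[] = [];
  for (let i = 0; i < nonEmpty.length; i += size) {
    // Assumption: paragraphs inside a chunk are separated by a blank line.
    chunks.push(nonEmpty.slice(i, i + size).join("\n\n"));
  }
  return chunks;
}
```

With this sketch, `mergeParagraphs(texts, -1)` would hand the engine everything as one chunk, while `mergeParagraphs(texts, 1)` would produce one chunk per paragraph, so a single numeric setting would cover both ends of the "merge or split" choice.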

Additionally, I'm not confident that it's wise to support this customization on anything other than native TTS engines. If you'd like me to restrict it to only native TTS engines, let me know and I'm happy to give it a go (some minor pointers would be appreciated here)

Thanks again for making and supporting this extension and for taking a look at this PR!