snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Feature request - Add SSML support to Telegram TTS bot #178

Closed: baterflyrity closed this issue 2 years ago

baterflyrity commented 2 years ago

Greetings. I was lucky enough to come across your TTS bot in Telegram (https://t.me/silero_voice_bot). It is very cool, and it is doubly cool that it supports Russian.

🚀 Feature

Unfortunately, the bot is quite wrong in punctuation and pronunciation, especially emphasis. Hence, it would be nice to support SSML or some other emphasis markup language.

Motivation

Pitch

Consider adding a lightweight markup based on paired symbols, so that it fits within the 500-character limit. Based on the SSML example, I can suggest an equivalent example for the bot:

Когда я просыпаюсь, _я говорю довольно медленно_.
Потом я начинаю говорить своим обычным голосом,
@а могу говорить тоном выше@
или $наоборот, ниже$.
Потом, если повезет – ^я могу говорить и довольно быстро^.
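
To illustrate how such a shorthand could map onto SSML, here is a minimal sketch. The symbol-to-attribute mapping (`_` slow, `^` fast, `@` high pitch, `$` low pitch) and the function name `shorthand_to_ssml` are assumptions read off the example above; this is not how the bot works. Only the `<speak>` and `<prosody>` elements with their `rate` and `pitch` attributes come from the SSML specification.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical mapping from the proposed paired symbols to SSML prosody
# attributes. The symbols come from the example above; the chosen values
# are an assumption.
MARKERS = {
    "_": ("rate", "slow"),
    "^": ("rate", "fast"),
    "@": ("pitch", "high"),
    "$": ("pitch", "low"),
}

# A marker symbol, the shortest possible run of text, then the same symbol.
_PAIR = re.compile(r"([_^@$])(.+?)\1", re.DOTALL)


def shorthand_to_ssml(text: str) -> str:
    """Convert paired-symbol shorthand into an SSML string.

    Nested or overlapping markers are not handled in this sketch.
    """
    def to_prosody(match: re.Match) -> str:
        attr, value = MARKERS[match.group(1)]
        return f'<prosody {attr}="{value}">{match.group(2)}</prosody>'

    # Escape XML special characters first so the generated tags stay intact.
    body = _PAIR.sub(to_prosody, escape(text))
    return f"<speak>{body}</speak>"


print(shorthand_to_ssml("или $наоборот, ниже$."))
# prints: <speak>или <prosody pitch="low">наоборот, ниже</prosody>.</speak>
```

Each marked span costs two characters in the shorthand, while the equivalent `<prosody>` pair costs around thirty, which is why the shorthand is easier to fit into a 500-character message.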

For now, the bot produces the opposite results:

Когда я просыпаюсь, _я говорю довольно медленно_. | speaks fast
Потом я начинаю говорить своим обычным голосом, @а могу говорить тоном выше@ | nothing happens
или $наоборот, ниже$. | nothing happens
Потом, если повезет – ^я могу говорить и довольно быстро^. | speaks a bit slow

Alternatives

Otherwise, full SSML could be supported, but this would require increasing the character limit.

snakers4 commented 2 years ago

Hi,

> Unfortunately, the bot is quite wrong in punctuation and pronunciation, especially emphasis.

Providing every knob was never the bot's goal. SSML is not supported in the bot by design. Moreover, judging by the bot's main target audience and how the different features are used, fine-grained controls, especially with less controllable voices, may be just too much.

baterflyrity commented 2 years ago

Thanks for the response. In that case, can the Warcraft model with the voices presented in the bot be downloaded somewhere? Alternatively, is the bot's source code available so that I could make a PR?

Also, can the model be trained? If not, is training planned? Maybe server-side training with a user's custom dataset...

snakers4 commented 2 years ago

No, no, no.