Tomiinek / Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
MIT License

Adding support for windows sapi5 or android #69

Closed. king-dahmanus closed this issue 2 years ago.

king-dahmanus commented 2 years ago

Hello developer. I'm here to request a feature for this awesome TTS of yours. Please make a Windows SAPI5 release that meets the following criteria:

  1. Responsive: doesn't have any delay before starting the speech and doesn't lag.
  2. When sped up, it doesn't produce weird artifacts that degrade the quality of the speech.

If that can't be achieved with a neural network, please convert it to an HTS-based system instead, with all of its voices, if possible.

What is SAPI5, and why am I setting these conditions? Well, I'm a blind person who uses a screen reader to know what's on my screen and to navigate my OS and apps like everyone else does. SAPI5 is Windows's speech system; you can make voices that are compatible with that speech application programming interface. Alternatively, you could make it as an add-on for NVDA (NonVisual Desktop Access), the free and open-source screen reader written in Python, if that's easier for you. I'm not a developer yet, so I'm just suggesting. I look forward to your replies.

Thanks and regards, Dahmanus.

P.S. If you want to make it as an NVDA add-on, please consider implementing the possibility to mix different languages together, i.e. when the TTS finds Latin text it reads it with the voice you specified, and likewise for non-Latin text.

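The language-mixing idea from the P.S. could be sketched roughly as below. This is only an illustration of splitting text into Latin and non-Latin runs using Python's standard `unicodedata` module; it is not something this repository or NVDA provides. The function names `script_of` and `split_by_script` and the voice names in `VOICES` are hypothetical, and the "starts with LATIN" check on the Unicode character name is a simplifying assumption.

```python
import unicodedata
from itertools import groupby


def script_of(ch):
    """Return 'latin' or 'other' for letters, None for neutral characters."""
    if not ch.isalpha():
        return None  # digits, spaces, punctuation carry no script of their own
    return "latin" if unicodedata.name(ch, "").startswith("LATIN") else "other"


def split_by_script(text):
    """Split text into (script, chunk) runs so each run can go to its own voice."""
    labels = []
    current = None
    for ch in text:
        s = script_of(ch)
        if s is not None:
            current = s
        labels.append(current)  # neutral characters inherit the preceding script
    # Neutral characters before the first letter take the first real label.
    first = next((label for label in labels if label is not None), "latin")
    labels = [label if label is not None else first for label in labels]

    runs = []
    pos = 0
    for label, group in groupby(labels):
        n = len(list(group))
        runs.append((label, text[pos:pos + n]))
        pos += n
    return runs


# Hypothetical voice names, for illustration only.
VOICES = {"latin": "voice_a", "other": "voice_b"}
for script, chunk in split_by_script("Hello мир, ok"):
    print(VOICES[script], repr(chunk))
```

A real add-on would hand each chunk to the synthesizer with the matching voice; that part depends on the speech engine and is left out here.
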
king-dahmanus commented 2 years ago

Hello. Please reply to this issue as soon as you can. Thanks and have a good day.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Tomiinek commented 2 years ago

Hello,

I am very sorry, but this repository focuses on research and shows a novel approach to code-switched or multilingual TTS. The shared models are far from ready for production use.

I am not familiar with things such as Windows SAPI5 or NVDA add-ons, and I do not have the capacity to work on them.

Sorry, and apologies for the late response. Have a nice day.

king-dahmanus commented 2 years ago

So it's undoable? SAPI5 is the Windows Speech Application Programming Interface, i.e. Speech API version 5. An NVDA add-on is an extension for NVDA, the free, open-source screen reader written in Python, that would integrate these voices into the screen reader. Could we collaborate (I'd be the tester and you the coder) so we can work on it? Or at least once the project has advanced a little?