Open sitatec opened 4 months ago
That should work. Actually, not a bad idea to support Google Cloud TTS as an engine.
You need to put the audio chunks into self.queue like in all the other engines. Make sure the get_stream_info method returns the needed information about the audio format, number of channels and sample rate. Look at gtts_engine.py and system_engine.py; these are quite simple engines that you can take as examples.
Ask me if you run into problems. A PR would be huge if it's finished ;)
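The pattern described above could be sketched roughly like this. This is a hypothetical engine, not the actual BaseEngine API: the class name, the stream-info tuple and the placeholder PA_INT16 constant (which stands in for pyaudio.paInt16) are assumptions, so check gtts_engine.py and system_engine.py for the real conventions before relying on it.

```python
import queue

PA_INT16 = 8  # placeholder for pyaudio.paInt16; a real engine would import pyaudio


class MyCloudEngine:
    """Hypothetical engine sketch: synthesize() pushes raw PCM chunks into
    self.queue, and get_stream_info() tells the player how to interpret them."""

    def __init__(self):
        self.queue = queue.Queue()  # the player thread reads audio chunks from here

    def get_stream_info(self):
        # (audio format, number of channels, sample rate) — must match
        # the audio bytes that synthesize() enqueues
        return PA_INT16, 1, 24000

    def synthesize(self, text: str) -> bool:
        # A real implementation would call the TTS API here and enqueue
        # each audio chunk as it arrives.
        for chunk in self._tts_chunks(text):
            self.queue.put(chunk)
        return True

    def _tts_chunks(self, text):
        # Placeholder generator: yields 50 ms of silent 16-bit mono PCM
        # so the sketch is self-contained and runnable.
        yield b"\x00\x00" * 1200
```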
I will try to implement it and open a PR. I have one more question: I previously tried generating 3 sentences separately in the Google Cloud Console, downloading the audio files and merging them, but it was noticeable that they were merged; it wasn't as smooth as generating the 3 sentences together. Does your library handle this case to make it sound natural (not like concatenated audio)?
That depends on why it does not sound natural. Everything you do to make it sound "more natural" should be done in the synthesize method. For CoquiEngine, for example, silence is added between the detected sentences to make it sound more natural.
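The silence-padding idea could look roughly like this; the 16-bit mono / 24 kHz format and the gap length are assumptions for the sketch, not what CoquiEngine actually uses:

```python
# Pad the gap between synthesized sentences with silent PCM so the
# sentence boundaries feel less abrupt when played back-to-back.

SAMPLE_RATE = 24000    # assumed output sample rate
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM


def silence(ms: int) -> bytes:
    """Return `ms` milliseconds of silent PCM."""
    return b"\x00" * (SAMPLE_RATE * BYTES_PER_SAMPLE * ms // 1000)


def join_sentences(chunks: list, gap_ms: int = 150) -> bytes:
    """Concatenate per-sentence audio chunks with a short silent gap between them."""
    return silence(gap_ms).join(chunks)
```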
Hi @KoljaB, thanks for this amazing repo. Great work 👍🏾, really!
I would like to know: if I create an engine that implements the BaseEngine methods and simply generates the audio for every text given to the synthesize method, would it work? My goal is to get LLM output chunks and use the Google Cloud TTS API (not the G-Translate one) to generate audio in real time.
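One Google-Cloud-specific detail worth knowing for such an engine: when requesting LINEAR16 encoding from the google-cloud-texttospeech client, the returned audio_content is wrapped in a WAV container, while an engine queue like the one described here expects raw PCM. A minimal helper to strip the container (a sketch: it only looks for the "data" chunk and does not fully parse the RIFF structure):

```python
def strip_wav_header(audio: bytes) -> bytes:
    """Return the raw PCM payload of a canonical WAV byte string.

    If the input is not a RIFF/WAVE container (or has no data chunk),
    it is assumed to be raw PCM already and returned unchanged.
    """
    if audio[:4] != b"RIFF" or audio[8:12] != b"WAVE":
        return audio
    idx = audio.find(b"data")
    if idx == -1:
        return audio
    # Skip the 4-byte "data" tag and the 4-byte chunk-size field.
    return audio[idx + 8:]
```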