met4citizen / TalkingHead

Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
MIT License

Google Translate tts #70

Open tomyslavs opened 1 week ago

tomyslavs commented 1 week ago

Is it possible to use the Google Translate TTS endpoint instead of the following?

ttsEndpoint: "https://eu-texttospeech.googleapis.com/v1beta1/text:synthesize",

met4citizen commented 1 week ago

No, only the official Google Cloud Text-to-Speech API endpoints and compatible API proxies are supported. That said, the endpoint is only used by the speakText method. If you use the speakAudio method instead, you can generate the audio in your app with any TTS engine you like. One common caveat is that not all TTS engines provide word-level timestamps, which are needed for accurate lip-sync. I haven't checked, but I'm pretty sure that Google Translate TTS can't provide them.
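
To illustrate the speakAudio route, here is a minimal sketch of how word-level timestamps from some other TTS engine might be packed into an object for speakAudio. The helper `buildSpeakAudioInput` and the `{ word, startMs, endMs }` timestamp format are hypothetical. The property names `audio`, `words`, `wtimes`, and `wdurations` (times in milliseconds) follow my reading of the project README, so please check them against the current documentation before relying on this.

```javascript
// Hypothetical helper (not part of TalkingHead): converts word timestamps
// from an external TTS engine into the object shape that speakAudio is
// assumed to accept: { audio, words, wtimes, wdurations }, times in ms.
function buildSpeakAudioInput(audioBuffer, timestamps) {
  const words = [];
  const wtimes = [];
  const wdurations = [];

  for (const { word, startMs, endMs } of timestamps) {
    words.push(word);
    wtimes.push(startMs);
    // Guard against engines that report an end time before the start time.
    wdurations.push(Math.max(0, endMs - startMs));
  }

  return { audio: audioBuffer, words, wtimes, wdurations };
}

// Usage sketch, assuming `head` is a TalkingHead instance and `decoded`
// is an AudioBuffer produced by your own TTS pipeline:
//
// const input = buildSpeakAudioInput(decoded, [
//   { word: "Hello", startMs: 0, endMs: 400 },
//   { word: "world", startMs: 450, endMs: 900 },
// ]);
// head.speakAudio(input);
```

If the engine provides no word timings at all, as is likely the case with Google Translate TTS, there is nothing to fill `wtimes` and `wdurations` with, and the lip-sync will be inaccurate.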