Closed by christo-zero-john 1 month ago
Yes, you can use the `speakAudio` method instead of `speakText`. However, in addition to the audio, you'll need to provide the words and their timestamps for accurate lip-sync. One way to obtain these is to use a transcription service. For an example, refer to the `mp3.html` app in the examples directory.
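As a rough illustration of "words and their timestamps", here is a minimal sketch that converts word-level transcription output into parallel arrays. The helper name `toLipsyncData` is hypothetical, the input shape (`word`, `start`, `end` in seconds) is an assumption about what the transcription service returns, and the field names `words`, `wtimes`, and `wdurations` (in milliseconds) reflect my reading of the library's README, so check them against `mp3.html` before relying on them.

```javascript
// Hypothetical helper (not part of the library): converts word-level
// transcription output into parallel word/timestamp arrays.
// Assumed input shape: [{ word: "Hello", start: 0.0, end: 0.42 }, ...],
// with start/end given in seconds.
function toLipsyncData(transcribedWords) {
  const words = [];
  const wtimes = [];      // word start times in milliseconds (assumed unit)
  const wdurations = [];  // word durations in milliseconds (assumed unit)

  for (const { word, start, end } of transcribedWords) {
    const text = String(word).trim();
    if (!text) continue; // skip empty or whitespace-only entries

    const startMs = Math.round(start * 1000);
    const endMs = Math.round(end * 1000);

    words.push(text);
    wtimes.push(startMs);
    wdurations.push(Math.max(0, endMs - startMs));
  }

  return { words, wtimes, wdurations };
}

// Usage sketch (browser only, not executed here). `head` is assumed to be an
// initialized TalkingHead instance and `audioBuffer` a decoded AudioBuffer:
//
//   const lipsync = toLipsyncData(transcript.words);
//   head.speakAudio({ audio: audioBuffer, ...lipsync });
```

Keeping the conversion in a small pure function makes it easy to swap the transcription service later, since only the input mapping changes.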
Really! Thanks for the help, and for this library too. It was a big help for my project.
Hi. So can I use it without a Google TTS API key? I want to use a Hugging Face model to convert text to speech and animate the face accordingly. How should I do that? I am using the `facebook/fastspeech2-en-ljspeech` model via the HF Inference API.
If you only use the `speakAudio` method, you don't need a Google TTS API key. The Google TTS API is only required when using the `speakText` method.
If the Hugging Face TTS service you are using provides word timestamps, you can simply use the `speakAudio` method. If it doesn't, you can either switch to a TTS service that does (such as Google, Microsoft, ElevenLabs, etc.) or use a transcription service (like OpenAI's Whisper) to extract word timestamps from the audio.
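For the transcription route, a minimal sketch of requesting word-level timestamps from OpenAI's hosted Whisper endpoint might look like the following. The function names `buildTranscriptionForm` and `transcribeWithWordTimestamps` are hypothetical, the default file name is a placeholder, and you would supply your own API key; treat this as one possible approach rather than the way the library's example does it.

```javascript
// Builds the multipart form for a transcription request that asks for
// word-level timestamps. Function name and default filename are illustrative.
function buildTranscriptionForm(audioBlob, filename = "speech.wav") {
  const form = new FormData();
  form.append("file", audioBlob, filename);
  form.append("model", "whisper-1");
  form.append("response_format", "verbose_json");   // needed for timestamps
  form.append("timestamp_granularities[]", "word"); // request per-word timing
  return form;
}

// Sends the audio to the transcription endpoint and returns the word list,
// expected to look like [{ word, start, end }, ...] with times in seconds.
async function transcribeWithWordTimestamps(audioBlob, apiKey) {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audioBlob),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const json = await response.json();
  return json.words ?? [];
}
```

The returned `start` and `end` values are in seconds, so they may need converting to whatever unit the lip-sync call expects. In a browser app, sending the API key from client code exposes it to users; routing the request through your own backend is usually the safer choice.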
Okay. Thanks for the help.
Is there any way to use custom audio (for example, audio on my system) instead of using Google TTS or other services?