Hi.
Does the 404 (Not Found) error for the /gtts/ request originate from the nginx server, the Flask server, or the Google TTS server? You can probably determine this by checking the Network tab in your browser's developer tools, where you can examine the actual request and response headers, among other details.
If the 404 error is coming from either nginx or Flask, it might be due to some configuration issue. Unfortunately, I'm not familiar with nginx or Flask, so I can't really say what that issue might be. In principle, your configuration seems alright to me.
If the 404 error is from the Google TTS server, here are a couple of observations. First, you should not send the Authorization header with a JWT to the Google TTS server. It is sufficient to include your Google TTS API key in the URL. The JWT should only be used to verify that the end-user is authorized to use your Google TTS API key for the request, based on the information specified in the token (such as username, expiration time, etc.).
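To make that concrete, here is a minimal sketch of such a proxy endpoint in Flask. The route path, environment variable names, and HS256-signed tokens are assumptions for illustration only, not the setup discussed in this thread:

```python
# Sketch of a Flask proxy for Google TTS: verify the end-user's JWT, then
# forward the request with the API key in the URL. Route name, variable
# names, and HS256 verification are illustrative assumptions.
import os
import jwt            # PyJWT
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
JWT_SECRET = os.environ["JWT_SECRET"]
GOOGLE_TTS_KEY = os.environ["GOOGLE_TTS_API_KEY"]

@app.post("/gtts/v1/text:synthesize")
def gtts_proxy():
    # 1. Verify the end-user's JWT from the Authorization header.
    auth = request.headers.get("Authorization", "")
    token = auth.removeprefix("Bearer ").strip()
    try:
        jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        return jsonify({"error": "invalid or expired token"}), 401

    # 2. Forward the body to Google TTS with the API key in the URL.
    #    Note: the Authorization header is NOT forwarded upstream.
    r = requests.post(
        "https://texttospeech.googleapis.com/v1/text:synthesize",
        params={"key": GOOGLE_TTS_KEY},
        json=request.get_json(),
        timeout=30,
    )
    return (r.content, r.status_code, {"Content-Type": "application/json"})
```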
In your token method, it seems you return a JSON string. However, according to the standard, a JSON Web Token consists of three Base64-URL encoded strings separated by dots. I'm not sure what you plan to do with your code eventually, but typically, once the user is identified via SSO, you would return a JWT that grants the user restricted and time-limited rights to make actual API calls. For more information about JWT, refer to jwt.io
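For illustration only, issuing such a token in Python with the PyJWT library could look roughly like this; the secret, claim names, and one-hour lifetime are arbitrary choices:

```python
# Sketch: issue a short-lived JWT after the user has been identified via SSO.
# The secret, claim names, and lifetime are illustrative assumptions.
import time
import jwt  # PyJWT

JWT_SECRET = "change-me"

def issue_token(username: str) -> str:
    now = int(time.time())
    claims = {
        "sub": username,     # who the token is for
        "iat": now,          # issued at
        "exp": now + 3600,   # expires in one hour
    }
    # Returns the three Base64-URL encoded parts separated by dots:
    # header.payload.signature
    return jwt.encode(claims, JWT_SECRET, algorithm="HS256")
```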
Thanks for answering. I like your component and will keep trying to make it work in my environment. Since it is pure JavaScript without React or other libraries, it would be useful for me in any web client environment. I generally program in C# for mobile, but I don't regularly use server-side languages; I know a little Flask because I used it to learn a bit about Gemini.
The 404 error seems to mean the request is not reaching the endpoint properly. But since I want to understand how your component works, I have been looking into how to implement the server part that you mention.
Would it be too much to ask you to show the code for the REST service you use as a proxy to implement the JWT? What language did you write it in? PHP with Apache? And what is required to make the component work with Google TTS, ElevenLabs, or another service? I understand that these services are only needed to obtain the audio from a text, or do you use them for something else?
You can read the outline of my own Apache2/JWT/SSO setup in the README, Appendix B.
For JWT specifically, I currently use the jwt-cli CLI tool along with shell scripts. However, there are many different JWT tools and libraries available for various programming languages (C#, PHP, Python, etc.). See https://jwt.io/libraries
I won't include my CGI scripts here for security reasons. However, the general idea is that my get JWT CGI script generates and encodes a new token using a CLI tool, then returns the token. For example (not a real token):
{ "jwt": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c" }
When the token (that three-part Base64-URL encoded string) is used in an API proxy request, the jwtverify CGI script extracts it from the Authorization header and uses the CLI tool to decode it. If the token is valid, not expired, etc., the script allows the proxy pass. The information you include in the token and what you check are up to you.
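If you want to do the same in your Flask/nginx setup rather than with CGI scripts, a rough equivalent is a small verification endpoint that nginx's auth_request directive can call before proxying. This is only a sketch under those assumptions, not my actual implementation:

```python
# Sketch of a token-verification endpoint in Flask, suitable for use with
# nginx's auth_request directive. Names and the HS256 secret are assumptions;
# the setup described above uses Apache2 CGI scripts and a JWT CLI tool.
import os
import jwt  # PyJWT
from flask import Flask, request

app = Flask(__name__)
JWT_SECRET = os.environ.get("JWT_SECRET", "change-me")

@app.get("/jwtverify")
def jwtverify():
    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return ("", 401)
    try:
        # Checks the signature and the exp claim; raises on failure.
        jwt.decode(auth[len("Bearer "):], JWT_SECRET, algorithms=["HS256"])
        return ("", 200)   # nginx lets the proxied request through
    except jwt.InvalidTokenError:
        return ("", 401)   # nginx rejects the request
```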
Yes, Google TTS and ElevenLabs are text-to-speech services. In addition to audio, they provide word-to-audio timestamps, which are essential for accurate lip-sync. If you want to use JWT + ElevenLabs, there is an Apache2 configuration example for the ElevenLabs WebSocket API in Appendix B. You can also find a client-side code example of how to use the class with ElevenLabs TTS in the test app index.html. See the methods jwtGet and elevenSpeak.
I'm curious, did you manage to make it work?
Hello, I haven't picked it up again yet. I was looking for a component like this for an AI experiment I did, where I wanted to integrate a talking avatar. In the end I used some examples with React, but I still haven't been able to get the lip-sync working. Anyway, I want to give myself time to fully understand your component, which seems more robust to me; just give me a little time to examine it better. If you want, you can see my experiment here:
https://avatar.virtualisimo.net/
I hope to continue exploring your component in a few more days. I would very much appreciate your help with that, and maybe I can help you improve this in return.
Thanks for the update — I enjoyed your upbeat demo!
Regarding the demo, if you haven't noticed, I have a short code example called mp3.html in the examples directory that can make an avatar lip-sync to any audio file. It uses OpenAI's Whisper to transcribe the audio and obtain word timestamps. The code then uses the TalkingHead's speakAudio method instead of speakText, which eliminates the need for any TTS engine. You can watch a related demo video here (the screen capture is from the test app index.html, but the relevant code is more or less the same as in mp3.html).
If your own audio analysis already provides word-level timestamps, integrating lip-sync would be quite straightforward without any TTS or API keys.
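For reference, getting those word-level timestamps from Whisper on the server side could look roughly like this in Python (mp3.html itself does this client-side in JavaScript; the model name and response handling below follow the OpenAI API docs and are not taken from mp3.html):

```python
# Sketch: obtain word-level timestamps from an audio file via OpenAI's
# Whisper transcription API. File name and model choice are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("speech.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
        response_format="verbose_json",
        timestamp_granularities=["word"],
    )

# Each entry has .word, .start, and .end (in seconds); this word/time data
# can then be mapped to what speakAudio needs alongside the audio itself.
for w in transcript.words:
    print(w.word, w.start, w.end)
```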
In your demo, the key highlight was, of course, the sentiment analysis, which I also found interesting. I've conducted some sentiment analysis experiments using GPT-4 with function calling, dynamically altering the avatar's mood or triggering animations—though only with text input/output. This was the first time I've seen it applied to a song.
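As a rough illustration of that mood idea (not my actual experiment), a function-calling request could look like the sketch below; the set_mood tool, its mood list, and the model name are made up for the example:

```python
# Sketch: let the model pick a mood via function calling; the client would
# then apply that mood to the avatar. Tool name, mood list, and model name
# are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "set_mood",
        "description": "Set the avatar's mood based on the sentiment of the reply.",
        "parameters": {
            "type": "object",
            "properties": {
                "mood": {
                    "type": "string",
                    "enum": ["neutral", "happy", "sad", "angry", "fear", "love"],
                },
            },
            "required": ["mood"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "I just got great news today!"}],
    tools=tools,
)

for call in resp.choices[0].message.tool_calls or []:
    if call.function.name == "set_mood":
        mood = json.loads(call.function.arguments)["mood"]
        print(mood)  # e.g. "happy"; the client applies this to the avatar
```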
I will close this issue, but feel free to reply if you have further questions.
Hi, I am a newbie with this library... I am trying to create the proxy so I can use the library with Flask behind nginx, but I can't get it to work.
The app is the following:
The HTML/JavaScript is the following (main.html):
The nginx configuration is:
I'm not an expert in Flask, but I think the nginx config file is OK? The output page works and loads the avatar,
but when I try to make it speak, this happens in the Google Chrome console:
Please help me fix the errors so I can work with this library.