Open sivertheisholt opened 4 months ago
This is honestly super uncanny and scary, but how does the format work? Is it a file? Is it streamed audio?
From what I have tried, you can upload an audio file with 10-15 seconds of talking. It is scarily accurate...
The Google funding must have been crazy, but hell no, I am not uploading any audio.
So, I cloned the voice of Furina (a character from Genshin Impact) using this audio sample: https://i.imgur.com/Wb9XuG3.mp4
I tried c.ai and play.ht for comparison. The character AI cloning was way faster and sometimes sounded identical to the real one.
Play.ht cloning is really good but sometimes it glitches the voice.
I think it would be a cool feature to use the cloned voices in the module.
Will look into it
OK, so I have no idea what the format of the data is, but the domain is used in joinOrCreateSession to create an RTC session.
Now I'd actually be surprised if it REALLY opened a WebRTC session to do this, but I'll investigate more.
Fetching the voices should be easy; they have a domain for it.
Any update on the voice system, brother?
Hello,
Not really. I haven't looked into it since, but if there is interest and this is a feature a lot of developers would like to see in the package, feel free to let me know.
Cheers
I think this will be the most in-demand feature. I am also a developer and I need this feature badly, so I request that you work on it, please.
I also request that you work on it. I'm an animatronic developer, and using premade Character AI TTS would save me a lot of time wrangling Mozilla TTS and installing a custom dataset on it.
After investigation, implementing this feature would mean reworking all the endpoints to use the new (neo) endpoints, switching to a WebSocket, etc.
So will you work on it or not, brother? Or do you need help with the coding part?
I think I might have to consider both. I am considering rewriting and upgrading to the new API. Will let you know, I am very busy right now.
Cheers :)
Omg, I didn't expect it to be using LiveKit. It feels like we are really chatting with someone :D Maybe you have some clues to share on how to make it work?
Maybe some of us can help.
However, switching to the new endpoints requires a total rewrite of the client (which I started doing to support all the new endpoints), but this is much more work than I thought, and my attempts at handling the WebSocket part confused me a lot.
I have found out that Character.AI is using EdgeTTS LOL
Any updates?
Check out #180. I will get to it soon, I am currently busy working on another project.
Hey there, sorry for the lack of updates.
Could using the LiveKit client be a good way to handle the actual TTS part? Also, upon investigating, the way it authorizes access to the actual LiveKit session is weird.
As far as I know, it's not using LiveKit anymore. They are using their own system; as of now, they are fetching an MP3 file from their server.
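If the audio really does arrive as an MP3 file, as the comment above reports (this is unverified), a client could sanity-check a downloaded payload before treating it as audio. The sketch below is only a heuristic in Python; `looks_like_mp3` is a hypothetical helper name and not part of this package or of any Character.AI API.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic check: does this payload start like an MP3 file?

    Hypothetical helper for illustration; it inspects only the first bytes.
    """
    # Many MP3 files begin with an ID3v2 metadata tag ("ID3").
    if data[:3] == b"ID3":
        return True
    # Otherwise expect an MPEG audio frame header, which starts with
    # 11 sync bits set: 0xFF followed by a byte whose top 3 bits are 1.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

This only looks at the leading bytes, so it can give false positives on other binary data; it is meant to catch obvious mismatches such as a JSON error body returned in place of audio.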
They have added a new TTS system, where you can create custom voices or use other voices/predefined voices. Would be cool if we could implement this. The current implementation is only able to fetch and use the old voices.
https://blog.character.ai/character-voice-for-everyone/
Any thoughts on this? I will take a look at it and see how it works.