CorentinJ / Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Best real time cloning? #1165

Open WYNNGATE opened 1 year ago

WYNNGATE commented 1 year ago

Hi, I'm looking for the best voice changer, that is, one that converts my speech into a cloned voice.

The use case is YouTube and other voice-over (VO) work.

gabrielmontagne commented 1 year ago

This commercial solution is quite good; I've tried it a lot and it's great. It's still not fully open, but you can request a test drive: https://www.resemble.ai/speech-to-speech/

Lolagatorade commented 1 year ago

It really sucks: there is AI you can run on local hardware for almost everything. For AI art there's Stable Diffusion; for GPT-style text you've got Alpaca and LLaMA. But there's nothing good for voice cloning.

tdlio commented 1 year ago

In my opinion, ElevenLabs has really effective voice cloning in terms of quality. Does anyone know, or have a guess at, what their training protocol may have been? That is, the base model, and then what else they added to it to bring it to where it is today. Breaking it down is the first step toward making an open-source alternative, which I'm very interested in doing!

WYNNGATE commented 1 year ago

Not real time, but surprisingly great:

https://play.google.com/store/apps/details?id=com.voicecopy.app

Lolagatorade commented 1 year ago

In my opinion, ElevenLabs has really effective voice cloning in terms of quality. Does anyone know, or have a guess at, what their training protocol may have been? That is, the base model, and then what else they added to it to bring it to where it is today. Breaking it down is the first step toward making an open-source alternative, which I'm very interested in doing!

Honestly, there's not much that is open in terms of voice cloning. You can go on GitHub, search for "voice cloning", and see what comes up; I believe some of those results have research papers. I remember there was a Chinese repository that runs locally, but of course I don't know Chinese. I just find it very strange how open image generation, face swap, and all the other things are, while so many companies are private when it comes to voice cloning.

WYNNGATE commented 1 year ago

Indeed! Yes, very strange. I suspect that AI singing will take over soon. Sigh... it will be great to hear such perfect singing voices, but at what price? The further destruction of culture, etc. I will say, I was very impressed with the app https://play.google.com/store/apps/details?id=com.voicecopy.app

PLEASE let me know if you discover any quality real-time solutions. Voicemod seems maybe okay; it looks like it can be run on a Windows server and then controlled via an Android app. Still, nothing too great.
