rsxdalv / tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
https://rsxdalv.github.io/tts-generation-webui/
MIT License
1.46k stars 160 forks

Regarding the problem of SeamlessM4T translating and cloning timbre/spoken language, here are cases #289

Open curui opened 3 months ago

curui commented 3 months ago

With only one audio sample, you can speak multiple languages in the voice of that sample. What is actually being used is Seamless. You can refer to this: https://replicate.com/adirik/seamless-expressive It also seems to be based on Seamless: https://github.com/replicate/cog-seamlessexpressive Demo: https://www.youtube.com/watch?v=lgL_rCF02Ng You can also refer to the recently popular GPT-SoVITS: https://github.com/RVC-Boss/GPT-SoVITS

rsxdalv commented 3 months ago

Is GPT-SoVITS a new replacement for RVC?

curui commented 3 months ago

https://replicate.com/adirik/seamless-expressive

You can upload an audio sample to test it. It appears to translate the speech directly and clone the voice at the same time: https://replicate.com/adirik/seamless-expressive

curui commented 3 months ago

He uses Seamless: https://huggingface.co/facebook/seamless-expressive
[Screenshot 2024-03-15 060657] I don't know why you do this

curui commented 3 months ago

"Is GPT-SoVITS a new replacement for RVC?"

GPT-SoVITS can also do this, but it is too complicated to use. With SeamlessExpressive, you only need to upload a piece of audio. However, access to the SeamlessExpressive model has to be requested and approved before you can obtain it. I don't know what the difference is between it and SeamlessM4T.

rsxdalv commented 3 months ago

"He uses Seamless: https://huggingface.co/facebook/seamless-expressive [Screenshot 2024-03-15 060657] I don't know why you do this"

Ah, now I remember seamless-expressive; I got it confused with SeamlessM4T. To be honest, I doubt that many people will be willing to fill out the form. GPT-SoVITS and SeamlessM4T can be done, though.