IAHispano / Applio

VITS-based Voice Conversion focused on simplicity, quality and performance.
https://applio.org
MIT License
1.29k stars 217 forks

Question: What kind of VITS model does Applio use? #471

Closed: mantrakp04 closed 3 weeks ago

mantrakp04 commented 3 weeks ago

I was curious about the architecture and the pipeline. If someone could summarize it for me, I'd be grateful. I couldn't find what kind of VITS model is being used.

blaisewf commented 3 weeks ago

https://github.com/jaywalnut310/vits

mantrakp04 commented 3 weeks ago

Thanks, but I was specifically looking for the pretrained VITS models. For example, meta-mms is based on VITS and covers 1000+ languages, so I was confused about why Applio is limited to only 30 languages when it could support 1000.

blaisewf commented 3 weeks ago

We don't use pretrained models from VITS. The languages mentioned in the README are for the UI.

mantrakp04 commented 3 weeks ago

Hmm, then which models does the UI use for base TTS?

blaisewf commented 3 weeks ago

Applio is voice-to-voice.

mantrakp04 commented 3 weeks ago

But you still need speech as input for voice-to-voice.

blaisewf commented 3 weeks ago

Edge TTS generates the audio, and then we convert it with Applio.
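For anyone who wants to picture that two-stage flow, here is a rough sketch. It is not Applio's actual code: `tts_then_convert`, the `convert` callback, the file names, and the default voice are all placeholders made up for illustration. Only the `edge_tts.Communicate(...).save(...)` call comes from the `edge-tts` package, and it is imported lazily so the rest of the sketch runs without it.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable

# Both stage signatures are assumptions for this sketch, not Applio's API.
Synthesize = Callable[[str, Path], Awaitable[None]]
Convert = Callable[[Path, Path], None]


async def edge_tts_synthesize(
    text: str, out_path: Path, voice: str = "en-US-AriaNeural"
) -> None:
    """Stage 1: text -> speech with the edge-tts package (pip install edge-tts)."""
    import edge_tts  # lazy import: only needed when this stage is actually used

    await edge_tts.Communicate(text, voice).save(str(out_path))


async def tts_then_convert(
    text: str, work_dir: Path, synthesize: Synthesize, convert: Convert
) -> Path:
    """Run TTS first, then hand the resulting audio to a voice-conversion step."""
    tts_path = work_dir / "tts_output.mp3"       # hypothetical file name
    converted_path = work_dir / "converted.wav"  # hypothetical file name

    await synthesize(text, tts_path)   # stage 1: speech from text
    convert(tts_path, converted_path)  # stage 2: voice-to-voice conversion
    return converted_path
```

To try it, pass `edge_tts_synthesize` as `synthesize` and wrap whatever voice-conversion entry point you use in a `convert(src, dst)` function, then call `asyncio.run(tts_then_convert(...))`. Keeping the two stages as separate callables mirrors the point above: the TTS engine only supplies input speech, and the conversion model never sees text.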

mantrakp04 commented 3 weeks ago

Cool, thanks for letting me know.