Closed mantrakp04 closed 3 weeks ago
Thanks, but I was specifically looking for the pretrained VITS models, like Meta's MMS, which is based on VITS and covers 1000+ languages. I was confused about why support is limited to only 30 languages when it would be possible to support 1000.
We don't use pretrained models from VITS. What we mention about the languages in the README is for the UI.
Hmm, then which models does the UI use for base TTS?
Applio is voice-to-voice.
But you still need speech as input for voice-to-voice.
Edge TTS makes the audio and then we convert with Applio
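For anyone who wants to picture that two-stage flow, here is a rough sketch, not Applio's actual code. The function name `tts_then_convert`, the file names, and the two callables are hypothetical and only illustrate the order of operations: a TTS step writes base speech, then a voice conversion step rewrites it in the target voice.

```python
from pathlib import Path
from typing import Callable

# Hypothetical signatures, chosen only for this sketch:
#   synthesize(text, out_path) writes base speech audio to out_path
#   convert(src_path, dst_path) writes the voice-converted audio to dst_path
Synthesize = Callable[[str, Path], None]
Convert = Callable[[Path, Path], None]


def tts_then_convert(
    text: str,
    work_dir: Path,
    synthesize: Synthesize,
    convert: Convert,
) -> Path:
    """Run text -> base speech -> converted speech and return the final path."""
    work_dir.mkdir(parents=True, exist_ok=True)
    base_audio = work_dir / "tts_base.mp3"    # output of the TTS stage
    converted = work_dir / "converted.wav"    # output of the voice-to-voice stage

    # Stage 1: text to speech (in the thread above, this role is played by Edge TTS).
    synthesize(text, base_audio)
    if not base_audio.exists():
        raise RuntimeError("TTS stage did not produce an audio file")

    # Stage 2: voice-to-voice conversion (in the thread above, this is Applio).
    convert(base_audio, converted)
    return converted


# One possible way to plug in the TTS stage, assuming the third-party
# `edge-tts` package is installed (left commented out so the sketch
# runs without network access):
#
#   import asyncio
#   import edge_tts
#
#   def edge_synthesize(text: str, out_path: Path) -> None:
#       asyncio.run(edge_tts.Communicate(text, "en-US-AriaNeural").save(str(out_path)))
```

The conversion callable is left abstract on purpose, since how Applio is invoked is not described in this thread.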
Cool, thanks for letting me know.
I was curious about the architecture and the pipeline; if someone could summarize it for me, I'd be grateful. I couldn't find what kind of VITS model is being used.