tangfucius opened this issue 1 month ago (status: Open)
https://huggingface.co/alvanlii/whisper-small-cantonese
No, this is just an experiment, and also a way to see whether it solves other users' problems. You should continue to use our hon9kon9nizer repository, as it is newer and more promising.
No, it doesn't support one-shot voice cloning. The example you provided is from a different person (or company?), and it's paid and doesn't seem to be open source.
Thanks for the link! Will try that out to save time.
This seems to be the person behind cantonese.ai, and the TTS samples he provided sounded pretty decent, but he never released his models. Just wondering if you are aware of his work.
Hi! I am from HK and just started learning about Cantonese TTS. My first goal is to train it on 林尚義's voice. I am starting with this repo as suggested in this repo's README, but that repo doesn't have a section for issues, so I am asking here instead.
I managed to get inference working in webui.py by following the instructions in this PR, and now I want to train a new speaker. I have a few questions:
1. Is the process to train a new speaker the one described in webui_preprocess.py? That is, do I need to prepare an esd.list with the following format: ****.wav|{说话人名}|{语言 ID}|{标签文本} (i.e. wav file name, speaker name, language ID, transcript text)?
2. Audio is easy to find, but is the annotated text needed too? Preparing the text would be quite time-consuming if there is a lot of audio data. Are there utils that can help with that?
3. Based on your comment, should we work on the Style-Bert-VITS2 branch instead? I can also contribute to better docs if I get things going.
4. Does the framework support one-shot voice cloning, as claimed in cantonese.ai (unfortunately the web demo isn't available)? I assume not, but would like to confirm.
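To make the format question concrete, here is a minimal Python sketch of how a list file in the pipe-separated layout quoted above (wav file name, speaker name, language ID, transcript text) could be generated. It is not taken from the repo: the helper names `format_entry` and `write_list`, the output file name, the speaker name `speaker_a`, and the language ID `YUE` are all hypothetical placeholders, and the values the preprocessing script actually accepts would need to be confirmed by the maintainers.

```python
from pathlib import Path


def format_entry(wav_name: str, speaker: str, language_id: str, text: str) -> str:
    """Build one line in the `wav|speaker|language ID|text` layout (hypothetical helper)."""
    fields = [wav_name, speaker, language_id, text.strip()]
    for field in fields:
        # "|" is the field separator, and each entry must stay on a single line.
        if "|" in field or "\n" in field:
            raise ValueError(f"field contains a separator or newline: {field!r}")
    return "|".join(fields)


def write_list(entries, out_path):
    """Write one formatted line per (wav, speaker, language ID, text) tuple."""
    lines = [format_entry(*entry) for entry in entries]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    # Placeholder data: "speaker_a" and "YUE" are assumptions, not values from the repo.
    write_list(
        [
            ("0001.wav", "speaker_a", "YUE", "你好"),
            ("0002.wav", "speaker_a", "YUE", "早晨"),
        ],
        "esd.list",
    )
```

The transcript column is still required as input here, so this only covers the formatting step. The text itself would have to come from manual annotation or from a speech recognition model.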
Thanks for your work in making Cantonese TTS open source! Hope I can contribute to this initiative going forward!