Open zhaojingxin123 opened 2 months ago
Does anyone know why that is? Or is there a Chinese dataset that has been trained successfully? What method do you use to phonemize Chinese text?
I am sorry, I haven't trained on a Chinese dataset, but I can assure you that the model training is language independent. There are forks in Kyrgyz https://github.com/UlutSoftLLC/MamtilTTS and Catalan https://huggingface.co/projecte-aina/matxa-tts-cat-multiaccent . So perhaps someone who has trained on a Chinese dataset can chime in.
Just to confirm, did you see this page? https://github.com/shivammehta25/Matcha-TTS/wiki/Training-%F0%9F%8D%B5-Matcha%E2%80%90TTS-with-different-dataset-&-languages
Hello author, thank you for your answer! I am sorry for the late reply; I have been ill recently and did not see your message. There should be no big problems with your code and model. The issue was that I converted Chinese to phonemes the wrong way. Your project can indeed be applied to Chinese, but the wavs generated by my trained model are very noisy.
I trained the model on a Chinese dataset, AISHELL3, for 119 epochs, and the results are poor.
My config is:
What do you think is the reason? 1. Is the number of training epochs too low? 2. Are 174 speakers too many? 3. Is there too little data per speaker? 4. Does n_vocab: 50 for the symbols have any influence?
How can I improve the synthesis?
I think the dataset size and training should be enough.
On your question 4 (whether n_vocab: 50 for the symbols has any influence): do you really have only 50 symbols? I feel something might be wrong here. Which phonemizer are you using?
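One way to sanity-check the symbol count is to list every distinct symbol that actually appears in the phonemized transcripts and compare that number with n_vocab. The snippet below is only a minimal sketch: `symbol_inventory` is a hypothetical helper, the three transcripts are made-up examples, and it assumes symbols are separated by single spaces, which may not match your preprocessing.

```python
from collections import Counter


def symbol_inventory(phonemized_lines, separator=" "):
    """Count how often each symbol occurs across phonemized transcripts."""
    counts = Counter()
    for line in phonemized_lines:
        # Drop empty tokens caused by repeated separators.
        counts.update(tok for tok in line.strip().split(separator) if tok)
    return counts


# Hypothetical phonemized transcripts, one utterance per line.
lines = ["n i3 h ao3", "zh ong1 w en2", "n i3 m en5"]

inventory = symbol_inventory(lines)
symbols = sorted(inventory)
print(len(symbols), symbols)
```

If the number of distinct symbols found this way is far from the configured n_vocab, the text front end and the config are probably out of sync, which would be worth ruling out before changing anything else.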
Thank you for your answer, shivammehta25. It's not the International Phonetic Alphabet (IPA), but rather Taiwanese Pinyin, a type of Chinese phoneme set with 50 symbols. The model I trained on AISHELL3 has a bit of human voice in its output but also contains a lot of noise. Previously, I used the Mainland Chinese version of Pinyin, another form of Chinese phonetic notation with over 200 symbols. I suspect that the issue might be due to wrong processing, specifically with the Mainland Chinese version of Pinyin.
May I ask whether there are any experiments on a Chinese dataset? Why is everything I synthesize noise when I train on a Mandarin Chinese dataset using pinyin as phonemes?
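For anyone comparing pinyin-based symbol sets, here is a rough sketch of one common way to turn tone-numbered Mainland pinyin syllables (such as `zhong1`) into a smaller set of symbols by splitting each syllable into an initial and a toned final. It is not the processing used in this thread: `split_syllable` and `to_symbols` are hypothetical names, treating `y` and `w` as initials is a simplification, and a missing tone digit is assumed to mean the neutral tone `5`.

```python
# Two-letter initials must be tried before one-letter ones.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split a tone-numbered pinyin syllable into (initial, final, tone)."""
    tone = "5"  # assumption: no trailing digit means neutral tone
    if syllable and syllable[-1].isdigit():
        syllable, tone = syllable[:-1], syllable[-1]
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):], tone
    # Syllables such as "an" or "er" have no initial.
    return "", syllable, tone


def to_symbols(syllables):
    """Flatten syllables into a sequence of initials and toned finals."""
    symbols = []
    for syl in syllables:
        initial, final, tone = split_syllable(syl)
        if initial:
            symbols.append(initial)
        symbols.append(final + tone)
    return symbols


print(to_symbols(["ni3", "hao3", "zhong1", "wen2"]))
```

With this kind of split, the tone stays attached to the final, so the symbol table needs one entry per initial plus one per final and tone combination. Whatever scheme is used, every symbol the front end can emit has to be present in the model's symbol list.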