netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Apache License 2.0

What do tokens like uncased15 - uncased99 mean? #78

Open huangxu1991 opened 6 months ago

huangxu1991 commented 6 months ago

image

huangxu1991 commented 6 months ago

I also found a [LAUGH] token, which I guessed would produce laughter. To my disappointment, it never worked in several trials.

Ccj0221 commented 6 months ago

Hello, I ran into the same thing, and it does not seem to affect the outcome. Separately, when using MFA to generate phonemes, I accidentally skipped a wav file; I forced the code to skip it and ran the rest of the pipeline. During training, it then reported entries missing from the lexicon, such as 'uu1' and 'uu4', as well as '1' and '9'. I replaced uncased* entries with appropriate values to complete the dictionary, and training ran successfully. Yesterday I trained for 30000 steps, but the results were mediocre: the generated audio contains electronic-sounding artifacts, and its speed needs to be increased to around 225%.

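For anyone trying the same workaround, here is a minimal sketch of swapping placeholder entries for missing symbols. It assumes the token list is a plain list of strings and that placeholders match `uncased<number>`; the function name `fill_placeholders` and the sample tokens are hypothetical and not taken from the EmotiVoice code.

```python
import re

# Assumption: placeholder entries look like "uncased15", "uncased16", ...
PLACEHOLDER = re.compile(r"^uncased\d+$")


def fill_placeholders(tokens, missing):
    """Return a copy of `tokens` with placeholders replaced by missing symbols.

    The list length and the position of every other token stay unchanged,
    so existing token indices are not shifted.
    """
    result = list(tokens)
    existing = set(tokens)
    # Skip symbols that are already present; drop duplicates but keep order.
    pending = list(dict.fromkeys(m for m in missing if m not in existing))
    slots = [i for i, tok in enumerate(result) if PLACEHOLDER.match(tok)]
    if len(pending) > len(slots):
        raise ValueError("not enough placeholder slots for the missing symbols")
    for index, symbol in zip(slots, pending):
        result[index] = symbol
    return result


# Hypothetical sample data, using the symbols mentioned in the comment above.
tokens = ["a1", "[LAUGH]", "uncased15", "uncased16",
          "uncased17", "uncased18", "uncased19"]
missing = ["uu1", "uu4", "1", "9"]
filled = fill_placeholders(tokens, missing)
print(filled)
```

Reusing the placeholder slots instead of appending new entries keeps the vocabulary size fixed, which should matter if you fine-tune from a checkpoint whose embedding table was built for that size.
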
syq163 commented 6 months ago

image

These tokens don't have any practical meaning; you can just ignore them.