Open yygg678 opened 5 days ago
I have thoroughly reviewed the project documentation and read the related paper(s).
All details are given in our paper, including the training corpus used for the small model, the batch size, and evaluation results from 400K to 800K updates. If you train with the same batch size, you should start to hear something intelligible at approximately 200K updates.
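For anyone comparing their own run against that 200K-update mark, here is a rough sketch of the arithmetic, not something taken from the paper. It assumes the batch size is measured in seconds of audio per update; the function name and the batch value are placeholders, so substitute the figure given in the paper.

```python
def passes_over_dataset(updates: int, batch_audio_seconds: float, dataset_hours: float) -> float:
    """Estimate how many times training has cycled through the dataset.

    Assumption: each update consumes `batch_audio_seconds` seconds of audio
    in total (summed over all GPUs). This is a hypothetical convention.
    """
    dataset_seconds = dataset_hours * 3600.0
    return updates * batch_audio_seconds / dataset_seconds


# Placeholder only: NOT the batch size from the paper.
PLACEHOLDER_BATCH_SECONDS = 1000.0

# 200 hours of data, trained to 200K updates.
passes = passes_over_dataset(200_000, PLACEHOLDER_BATCH_SECONDS, 200.0)
print(f"approx. passes over the data: {passes:.1f}")
```

If the update count of a run is far below 200K, or its batch is much smaller than the one in the paper, unintelligible output may simply mean the model is undertrained rather than short on data.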
Checks
Question details
I trained the model from scratch on my own phone sequences, using about 200 hours of Chinese data and a 155M model. The synthesized speech is completely unintelligible. How much data is generally needed to train a model from scratch?