Open nvadigauvce opened 7 years ago
Try LSTMs and, if possible, increase your training data (1000 files may be too few for DNNs).
Thanks, @bajibabu. I will try LSTMs and increase the training data. But could the problem be related to the question set or the linguistic features? When we trained an HMM system on the same data, it gave very good quality, better than the Merlin DNN.
@nvadigauvce, can you tell me which front-end you use, and how to build a TTS system with Merlin for a new language (one that Festival does not support)? Thanks.
wotulong wrote:
how to build a TTS system with Merlin for a new language
The general principles are covered in this tutorial:
http://www.speech.zone/courses/one-off/merlin-interspeech2017/
We also recommend using the Ossian framework to build front-ends for new languages.
Simon
@simonkingedinburgh It's very helpful, thanks very much.
Dear All,
I am trying to train a Hindi TTS system with the configuration below. The quality is acceptable, but not up to the mark. Which parameters can I tune to improve the quality of the Hindi TTS?
I am attaching the synthesized wave files in a zip file: Hindi_wavefiles.zip
Number of sentences used in training: 1000 files
Vocoder: WORLD
Sampling frequency: 16000 Hz
Architecture used:
hidden_layer_size: [512, 512, 512, 512]
hidden_layer_type: ['TANH', 'TANH', 'TANH', 'TANH']
I got the following objective results.
Duration model:
Valid -- RMSE: 6.953 frames/phoneme; CORR: 0.746
Test -- RMSE: 6.585 frames/phoneme; CORR: 0.752
Acoustic model:
Valid -- MCD: 5.255 dB; BAP: 0.235 dB; F0 RMSE: 11.295 Hz; F0 CORR: 0.754; VUV: 6.822%
Test -- MCD: 5.247 dB; BAP: 0.244 dB; F0 RMSE: 12.003 Hz; F0 CORR: 0.757; VUV: 6.111%
Regards, Nagaraj
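For readers comparing numbers like the ones reported above, here is a minimal sketch of how such objective metrics are commonly computed: RMSE and Pearson correlation (as used for duration and F0) and mel-cepstral distortion in dB. It is not taken from Merlin's source. The function names and the toy values are illustrative assumptions, and the MCD function assumes the reference and generated frames are already time-aligned and that coefficient 0 (energy) is excluded.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_corr(reference, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


def mel_cepstral_distortion(ref_frames, gen_frames):
    """Mean MCD in dB over time-aligned frames, skipping coefficient 0."""
    if len(ref_frames) != len(gen_frames) or not ref_frames:
        raise ValueError("frame lists must be non-empty and of equal length")
    scale = (10.0 / math.log(10.0)) * math.sqrt(2.0)
    total = 0.0
    for ref, gen in zip(ref_frames, gen_frames):
        # Euclidean distance over coefficients 1..D for this frame
        total += math.sqrt(sum((r - g) ** 2 for r, g in zip(ref[1:], gen[1:])))
    return scale * total / len(ref_frames)


# Toy durations in frames per phoneme (made-up values, not from this thread)
ref_dur = [10, 20, 30, 40]
gen_dur = [12, 18, 33, 37]
duration_rmse = rmse(ref_dur, gen_dur)
duration_corr = pearson_corr(ref_dur, gen_dur)

# Toy mel-cepstra: one frame, differing by 1.0 in a single non-energy coefficient
toy_mcd = mel_cepstral_distortion([[0.0, 1.0, 0.0]], [[5.0, 0.0, 0.0]])

print(f"RMSE: {duration_rmse:.3f} frames/phoneme; CORR: {duration_corr:.3f}")
print(f"MCD: {toy_mcd:.3f} dB")
```

Under these definitions, a lower RMSE and MCD and a higher CORR indicate predictions closer to the reference, which gives one way to compare the DNN numbers against an HMM baseline trained on the same data.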