Improving the quality of Synthesized speech for Hindi

nvadigauvce commented 7 years ago

Dear All,

I have been training a Hindi TTS system with the configuration below. The quality is acceptable, but not up to the mark. Which parameters can I tune to improve the quality of the synthesized Hindi speech?

The synthesized wave files are attached as a zip file: Hindi_wavefiles.zip

No of sentences used in training: 1000 files
Vocoder=WORLD
SamplingFreq=16000
Architecture used:
hidden_layer_size: [512, 512, 512, 512]
hidden_layer_type: ['TANH', 'TANH', 'TANH', 'TANH']

I got the below objective results for duration model:
Valid -- RMSE: 6.953 frames/phoneme; CORR: 0.746
Test -- RMSE: 6.585 frames/phoneme; CORR: 0.752
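For readers unfamiliar with the duration metrics quoted here, the sketch below shows one common way to compute an RMSE (in frames per phoneme) and a Pearson correlation between reference and predicted durations. It is only an illustration: the function names and the sample durations are hypothetical, and Merlin's own evaluation code may differ in detail.

```python
import numpy as np

def rmse(reference, predicted):
    """Root-mean-square error between two equal-length sequences."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((reference - predicted) ** 2)))

def pearson_corr(reference, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(reference, predicted)[0, 1])

# Hypothetical per-phoneme durations in frames (not taken from this thread).
ref_dur = [10, 7, 15, 4, 9]
pred_dur = [12, 6, 11, 5, 9]

print("RMSE (frames/phoneme):", rmse(ref_dur, pred_dur))
print("CORR:", pearson_corr(ref_dur, pred_dur))
```

A lower RMSE and a correlation closer to 1 indicate that predicted durations track the reference more closely.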

acoustic model:
Valid -- MCD: 5.255 dB; BAP: 0.235 dB; F0:- RMSE: 11.295 Hz; CORR: 0.754; VUV: 6.822%
Test -- MCD: 5.247 dB; BAP: 0.244 dB; F0:- RMSE: 12.003 Hz; CORR: 0.757; VUV: 6.111%
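As a rough guide to the acoustic metrics, the sketch below computes mel-cepstral distortion (MCD) using a widely used definition, plus F0 RMSE and voiced/unvoiced (VUV) error. It rests on several assumptions: features are frame-aligned, the 0th cepstral coefficient is excluded, and unvoiced frames are marked by F0 <= 0. The function names and conventions are hypothetical and are not taken from Merlin's source, which may compute these values differently.

```python
import numpy as np

def mel_cepstral_distortion(ref_mgc, pred_mgc):
    """Mean MCD in dB over frames; arrays are (frames, coefficients).

    Assumption: the 0th (energy) coefficient is excluded.
    """
    ref = np.asarray(ref_mgc, dtype=float)[:, 1:]
    pred = np.asarray(pred_mgc, dtype=float)[:, 1:]
    squared_diff = np.sum((ref - pred) ** 2, axis=1)
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * squared_diff)
    return float(np.mean(per_frame))

def f0_metrics(ref_f0, pred_f0):
    """Return (F0 RMSE in Hz, VUV error in %).

    Assumption: a frame is unvoiced when its F0 value is <= 0.
    RMSE is computed only over frames voiced in both sequences.
    """
    ref = np.asarray(ref_f0, dtype=float)
    pred = np.asarray(pred_f0, dtype=float)
    ref_voiced = ref > 0
    pred_voiced = pred > 0
    vuv_error = 100.0 * float(np.mean(ref_voiced != pred_voiced))
    both = ref_voiced & pred_voiced
    if not np.any(both):
        return float("nan"), vuv_error
    f0_rmse = float(np.sqrt(np.mean((ref[both] - pred[both]) ** 2)))
    return f0_rmse, vuv_error
```

For example, `f0_metrics([100, 0, 200, 0], [110, 0, 0, 50])` compares F0 only on the first frame, the only one voiced in both sequences, and counts the last two frames as VUV errors.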

Regards, Nagaraj

bajibabu commented 7 years ago

Try LSTMs and, if possible, increase your training data (1000 files may be too few for DNNs).

nvadigauvce commented 7 years ago

Thanks, @bajibabu. I will try LSTMs and increase the training data. But could the problem be related to the question set or the linguistic features? When we tried the same data with an HMM system, it gave very good quality, better than the Merlin DNN.

wotulong commented 7 years ago

@nvadigauvce, can you tell me which front-end you use, and how to build a TTS system with Merlin for a new language (one that Festival does not support)? Thanks.

simonkingedinburgh commented 7 years ago

wotulong wrote:

how to build a TTS system with Merlin for a new language

The general principles are covered in this tutorial:

http://www.speech.zone/courses/one-off/merlin-interspeech2017/

We recommend using the Ossian framework to build front-ends for new languages.

Simon


wotulong commented 7 years ago

@simonkingedinburgh It's very helpful, thanks very much.

CSTR-Edinburgh / merlin

Improving the quality of Synthesized speech for Hindi #165