Any Suggestions to introduce pauses (Up or down) in the produced speech?

as-ideas / TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

https://as-ideas.github.io/TransformerTTS/

Any Suggestions to introduce pauses (Up or down) in the produced speech? #62

Open oyeamit opened 4 years ago

oyeamit commented 4 years ago

First of all, Great Work! Thanks for sharing the repo!

I have trained the autoregressive model on the LJ dataset. The output is quite good for short sentences. I am looking for advice on how to manipulate the pauses between words in the produced speech. Let's say the produced speech is 'This is Text to Speech model.' I want to slightly increase (or decrease) the pause between the words 'Speech' and 'model'.

Any Suggestions?

yutian-wang commented 4 years ago

This code uses a teacher-student mechanism. The autoregressive model is only used as a teacher, to generate the phoneme durations that are used to train the student forward model. So you might need to modify the corresponding code for the duration extraction.
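
To illustrate what extracting durations from a teacher can look like, here is a minimal sketch. It assumes a single attention matrix with one row per mel frame and one weight per phoneme, and it assigns each frame to the phoneme with the largest weight. The function name `durations_from_attention` and the toy matrix are hypothetical, and the extraction code in this repo may well work differently.

```python
def durations_from_attention(attention):
    """Count, for each phoneme, how many mel frames attend to it most strongly.

    attention: list of rows, one per mel frame; each row holds one
    attention weight per input phoneme (assumed layout, for illustration).
    """
    n_phonemes = len(attention[0])
    durations = [0] * n_phonemes
    for frame_weights in attention:
        # Assign this frame to the phoneme with the largest attention weight.
        best = max(range(n_phonemes), key=lambda i: frame_weights[i])
        durations[best] += 1
    return durations


# Toy alignment: 5 mel frames attending over 3 phonemes.
attention = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.0],
    [0.2, 0.8, 0.0],
    [0.0, 0.3, 0.7],
    [0.0, 0.1, 0.9],
]
print(durations_from_attention(attention))  # -> [2, 1, 2]
```

The durations always sum to the number of frames, which is what lets a forward model be trained to reproduce the full spectrogram length.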

cfrancesco commented 4 years ago

Hi, yes, you will want to train a forward model for this. There you can directly control the duration of each phoneme.
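
As a rough sketch of the idea, assuming the forward model predicts an integer number of mel frames per input token and that the space between words is its own token: a pause can be lengthened or shortened by scaling that token's duration before the encodings are expanded to frame level. The names `scale_pause_durations` and `length_regulate` are hypothetical and are not this repo's API.

```python
def scale_pause_durations(tokens, durations, factor, pause_tokens=(" ",), only_index=None):
    """Return a copy of durations with the pause tokens scaled by factor.

    If only_index is given, only the token at that position is changed,
    e.g. the single space between two specific words.
    """
    scaled = list(durations)
    for i, token in enumerate(tokens):
        if token in pause_tokens and (only_index is None or i == only_index):
            # Keep durations as non-negative integer frame counts.
            scaled[i] = max(0, int(round(durations[i] * factor)))
    return scaled


def length_regulate(encodings, durations):
    """Repeat each token encoding durations[i] times (frame-level expansion)."""
    expanded = []
    for encoding, duration in zip(encodings, durations):
        expanded.extend([encoding] * duration)
    return expanded


tokens = list("hi yo")               # ['h', 'i', ' ', 'y', 'o']
durations = [3, 4, 2, 3, 5]          # predicted frames per token (made-up values)
longer = scale_pause_durations(tokens, durations, factor=2.0)
print(longer)                        # -> [3, 4, 4, 3, 5]
frames = length_regulate(tokens, longer)
print(len(frames))                   # -> 19
```

A factor below 1 shortens the pause in the same way. Passing `only_index` restricts the change to one position, which matches the case of adjusting only the gap between 'Speech' and 'model'.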