Kyubyong / tacotron

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Apache License 2.0

How can we exploit forced alignments? #125

Open yoosif0 opened 5 years ago

yoosif0 commented 5 years ago

Thank you so much for the work you have done on your Tacotron implementation. I have a question, if I may. I have a speech corpus with time alignments. For each audio sample, I have a file that looks like this:

0.471000 121 sil
0.618000 121 Z
0.666000 121 i
0.716750 121 n
0.852974 121 a:
0.910125 121 z
0.987444 121 a
1.070000 121 t
1.130000 121 u
1.182000 121 l

Which Tacotron implementation is best suited to exploiting this information?
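
For reference, here is a minimal sketch of how an alignment file like the one above could be turned into per-phoneme durations. It rests on assumptions that are not confirmed by the corpus itself: each entry is read as `time code label`, the time is taken to be the end of the segment, the first segment is assumed to start at 0, and the middle field (`121`) is ignored. The function names and the 12.5 ms frame shift are hypothetical choices for illustration, not part of any Tacotron implementation.

```python
from typing import List, Tuple


def parse_alignment(text: str) -> List[Tuple[str, float, float]]:
    """Parse whitespace-separated 'time code label' triples.

    Assumption: each time is the END of its segment, and the first
    segment starts at 0.0. The middle field is ignored.
    Returns a list of (label, start_sec, end_sec).
    """
    tokens = text.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected 'time code label' triples")
    segments = []
    start = 0.0
    for i in range(0, len(tokens), 3):
        end = float(tokens[i])
        label = tokens[i + 2]
        segments.append((label, start, end))
        start = end
    return segments


def durations_in_frames(
    segments: List[Tuple[str, float, float]],
    frame_shift: float = 0.0125,  # assumed 12.5 ms hop; adjust to your setup
) -> List[Tuple[str, int]]:
    """Convert segment boundaries to frame counts per label.

    Boundaries are rounded cumulatively, so the frame counts sum to the
    rounded position of the last boundary instead of drifting.
    """
    frames = []
    prev = 0
    for label, _, end in segments:
        cur = round(end / frame_shift)
        frames.append((label, cur - prev))
        prev = cur
    return frames


if __name__ == "__main__":
    sample = "0.471000 121 sil 0.618000 121 Z 0.666000 121 i"
    for label, n in durations_in_frames(parse_alignment(sample)):
        print(label, n)
```

Durations in this form are what I would expect a duration-aware or alignment-guided model to consume, but whether a given implementation accepts them is exactly what I am asking about.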