How can we exploit forced alignments?

Thank you so much for the work you have done on your tacotron implementation. I have a question, if I may. I have a speech corpus with time alignments: for each audio sample, there is a file that looks like this.

0.471000 121 sil
0.618000 121 Z
0.666000 121 i
0.716750 121 n
0.852974 121 a:
0.910125 121 z
0.987444 121 a
1.070000 121 t
1.130000 121 u
1.182000 121 l
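To make the format concrete, here is a minimal sketch of how such a file could be parsed into per-phoneme segments and frame counts. It rests on assumptions that are not confirmed above: that each entry is a `time id label` triple, that the time marks the end of the segment in seconds, and that the first segment starts at 0. The function names and the 12.5 ms hop size are hypothetical and only for illustration.

```python
from typing import List, Tuple


def parse_alignment(text: str) -> List[Tuple[float, float, str]]:
    """Parse whitespace-separated `time id label` triples.

    Assumption: `time` is the END of the segment in seconds and the
    first segment starts at 0.0. The middle field is ignored.
    """
    tokens = text.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected a sequence of `time id label` triples")
    segments = []
    start = 0.0
    for i in range(0, len(tokens), 3):
        end = float(tokens[i])
        label = tokens[i + 2]
        segments.append((start, end, label))
        start = end  # the next segment begins where this one ends
    return segments


def durations_in_frames(
    segments: List[Tuple[float, float, str]],
    hop_seconds: float = 0.0125,  # hypothetical frame shift, adjust to your setup
) -> List[Tuple[str, int]]:
    """Convert segments to (label, frame_count) pairs.

    Boundaries are rounded to frame indices first and then differenced,
    so the counts add up to the rounded total length.
    """
    frames = []
    for start, end, label in segments:
        n = round(end / hop_seconds) - round(start / hop_seconds)
        frames.append((label, n))
    return frames


if __name__ == "__main__":
    sample = "0.471000 121 sil 0.618000 121 Z 0.666000 121 i"
    for label, n in durations_in_frames(parse_alignment(sample)):
        print(label, n)
```

The per-phoneme frame counts are one form in which the alignment could be handed to a model, for example as duration targets. Whether a given implementation can consume them is exactly the open question here.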

What is the best tacotron implementation that can exploit this information?

Kyubyong / tacotron
