cmusphinx / g2p-seq2seq

G2P with Tensorflow

Grapheme-phoneme alignment output #49

Closed DanielAsher closed 8 years ago

DanielAsher commented 8 years ago

Hi,

is it possible to output grapheme-phoneme alignment data from this model?

many thanks,

Daniel

nshmyrev commented 8 years ago

Unfortunately (or luckily), this model does not build an alignment. It is a seq2seq model: the encoder maps the input sequence to a single vector, and the decoder then turns that vector into another sequence. http://arxiv.org/pdf/1409.3215v3.pdf

This way you don't need input alignment.

It is possible to visualize attention focus on every decoding step, but it's not really an alignment.
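To illustrate why the attention focus is only a rough stand-in for an alignment, here is a minimal sketch that pairs each output phoneme with the input letter that received the most attention. The function name `attention_to_pairs` and the weight matrix are hypothetical: the numbers are made up for illustration and are not taken from g2p-seq2seq, which may not expose such a matrix directly.

```python
def attention_to_pairs(letters, phonemes, weights):
    """Pair each phoneme with the letter that received the most attention.

    weights[t][i] is the (hypothetical) attention weight on letters[i]
    at the decoding step that emitted phonemes[t].
    """
    if len(weights) != len(phonemes):
        raise ValueError("need one row of weights per phoneme")
    pairs = []
    for phoneme, row in zip(phonemes, weights):
        if len(row) != len(letters):
            raise ValueError("need one weight per letter in every row")
        # Index of the letter with the largest weight at this step.
        best = max(range(len(letters)), key=lambda i: row[i])
        pairs.append((letters[best], best, phoneme))
    return pairs


# Made-up weights for "aback" -> AH B AE K, for illustration only.
letters = list("aback")
phonemes = ["AH", "B", "AE", "K"]
weights = [
    [0.80, 0.10, 0.05, 0.03, 0.02],
    [0.10, 0.75, 0.10, 0.03, 0.02],
    [0.05, 0.10, 0.70, 0.10, 0.05],
    [0.02, 0.03, 0.05, 0.40, 0.50],
]
pairs = attention_to_pairs(letters, phonemes, weights)
for letter, index, phoneme in pairs:
    print(f"{letter}[{index}] -> {phoneme}")
```

With these weights the letter "c" is never selected, so the result does not cover every grapheme. Nothing forces the picks to be monotonic or complete, which is one reason this is not really an alignment.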

DanielAsher commented 8 years ago

Hi Nickolay,

I really appreciate the link!

I do require Letter-Phoneme alignment data output (though there may be a better technical term for this facility). Something like:

aback a-AH b-B a-AE c-@ k-K

or

a}AH b}B a}AE c|k}K

would work. Could this be done by "[visualizing] attention focus on every decoding step"?

I believe https://github.com/AdolfVonKleist/Phonetisaurus handles this, but wanted to know if this was possible with g2p-seq2seq. Any advice on where to start hacking on this would be very valuable.

Warm regards,

Daniel

nshmyrev commented 8 years ago

If you want an alignment, simply use Phonetisaurus or reimplement the alignment algorithm from it.
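As a starting point for reimplementing an aligner, here is a minimal sketch of a monotonic many-to-one alignment found by dynamic programming. It is not Phonetisaurus's actual algorithm: Phonetisaurus learns its alignment scores from a pronunciation dictionary, whereas the `TOY_SCORES` table, the `toy_score` function, and the `align` helper below are hand-written assumptions for illustration. The output uses the `c|k}K` notation requested above.

```python
def align(letters, phonemes, score, max_letters=2):
    """Give each phoneme 1..max_letters consecutive letters, maximizing total score."""
    n, m = len(letters), len(phonemes)
    neg_inf = float("-inf")
    # best[i][j]: best score aligning the first i letters to the first j phonemes.
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    # back[i][j]: how many letters the j-th phoneme consumed on the best path.
    back = [[0] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(1, min(max_letters, i) + 1):
                prev = best[i - k][j - 1]
                if prev == neg_inf:
                    continue
                cand = prev + score("".join(letters[i - k:i]), phonemes[j - 1])
                if cand > best[i][j]:
                    best[i][j] = cand
                    back[i][j] = k
    if best[n][m] == neg_inf:
        return None  # no alignment exists under these constraints
    # Walk the back-pointers from the end to recover the chunks.
    chunks = []
    i, j = n, m
    while j > 0:
        k = back[i][j]
        chunks.append("|".join(letters[i - k:i]) + "}" + phonemes[j - 1])
        i -= k
        j -= 1
    chunks.reverse()
    return " ".join(chunks)


# Hand-written toy scores; a real aligner would estimate these from data.
TOY_SCORES = {("a", "AH"): 1.0, ("a", "AE"): 1.0, ("b", "B"): 1.0,
              ("c", "K"): 1.0, ("k", "K"): 1.0, ("ck", "K"): 1.0}


def toy_score(graphemes, phoneme):
    # Unknown pairs get a penalty so that known pairs are preferred.
    return TOY_SCORES.get((graphemes, phoneme), -1.0)


result = align(list("aback"), ["AH", "B", "AE", "K"], toy_score)
print(result)
```

This sketch only lets one phoneme absorb several letters. Handling one letter that maps to several phonemes, or letters that map to no phoneme, would need extra transitions in the same table.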