srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

Output of trained network and input of decoding lattice #115

Open Sundy1219 opened 7 years ago

Sundy1219 commented 7 years ago

I looked at your paper "EESEN:end-to-end speech recognition using deep rnn models and wfst-based decoding" for many times. I still don't understand why posterior normalization is needed during decoding . Question 1 can you explain it in detail ? Question 2 isn't it softmax probability produced when Wav features are send to the trained network ?, why is "dir/label count" needed? Question 3 Is the input parameter softmax probability for latgen-faster Looking forward to your reply @fmetze @riebling