srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

Difference between decode-faster vs. latgen-faster #159

Open stefbraun opened 6 years ago

stefbraun commented 6 years ago

Hi,

I am trying to decode the output of an acoustic model (CTC) built in PyTorch with the eesen framework. On WSJ, I achieve good results with decode-faster as used in decode_ctc.sh. However, there is a newer decode_ctc_lat.sh that uses latgen-faster and a scoring script.

What is the difference between these methods? Will lattice-based decoding improve the results? If yes, do you have any numbers for orientation?

Thanks a lot for sharing the eesen framework, this is extremely helpful.
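(For reference, a minimal sketch of the export step such a pipeline needs, assuming a PyTorch CTC model and the kaldi_io package; the model, dataset, and file names are illustrative and not taken from the eesen recipes. The resulting archive is what decode-faster or latgen-faster would consume in place of the recipe's own posterior dump.)

```python
# Sketch only: dump per-frame CTC log-probabilities from a PyTorch model into a
# Kaldi-style archive that decode-faster / latgen-faster can read. Assumes the
# output-layer token order matches the tokens used to build the decoding graph
# (blank included) and that the kaldi_io package is installed.
import torch
import kaldi_io

model.eval()  # `model` and `dataset` are placeholders for your own objects
with torch.no_grad(), open('loglikes.ark', 'wb') as f:
    for utt_id, feats in dataset:              # feats: (T, feat_dim) float tensor
        logits = model(feats.unsqueeze(0))     # (1, T, num_tokens)
        log_probs = torch.log_softmax(logits, dim=-1).squeeze(0)
        kaldi_io.write_mat(f, log_probs.cpu().numpy(), key=utt_id)
```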

fmetze commented 6 years ago

Hi, in most cases lattice-based decoding will improve results: it gives better time alignments and allows you to specify a word insertion penalty. It will also give you word confidences.

Does your PyTorch model follow some of the other recipes, e.g. the TensorFlow ones? We'd be interested to see a comparison between the frameworks.

Florian

ad349 commented 6 years ago

@stefbraun Could you explain how you used your PyTorch-trained model with decode-faster? I have trained a model with CTC loss in TensorFlow and pipe the logits to decode-faster, but my WFST only outputs 'wow'. Is there some transformation, or a particular way to output the logits, needed to make this work?
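
(Not an authoritative fix, but a common cause of degenerate hypotheses like this is passing raw logits, or tokens ordered differently from the graph's token list, to the decoder. A minimal NumPy sketch of the usual transformation; the reordering index is hypothetical and depends on how your model's outputs line up with the graph:)

```python
# Sketch: convert raw CTC logits to the per-frame log-probabilities the decoder
# expects. Column order must match the token list used to build the TLG graph
# (conventionally with the blank symbol first).
import numpy as np

def to_log_probs(logits, reorder=None):
    """logits: (T, num_tokens) raw network outputs for one utterance."""
    if reorder is not None:          # optional permutation mapping model outputs
        logits = logits[:, reorder]  # onto the graph's token ordering
    # numerically stable log-softmax over the token dimension
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
```

The resulting matrices can then be written to an ark (as in the earlier sketch) before being handed to decode-faster.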

stefbraun commented 6 years ago

@fmetze sorry for the super-late answer. I cannot share my PyTorch ASR pipeline at the moment, but I wrote up some LSTM benchmarks between PyTorch, TensorFlow, Lasagne and Keras that might be helpful:

PCerles commented 5 years ago

Hey @stefbraun, any plans to open-source the PyTorch decode-faster pipeline? Right now there is no public codebase for non-prefix-beam-search CTC decoding integrated with PyTorch that people can contribute to. I'm working on it and will put it up when I figure it out.