ZhengkunTian / rnn-transducer

A Pytorch Implementation of Transducer Model for End-to-End Speech Recognition

A question about beam search #1

Closed Marcovaldong closed 5 years ago

Marcovaldong commented 5 years ago

@ZhengkunTian Thanks for your code. I have read your blog post about the RNN-Transducer.

I would like to know how you implemented beam search. I implemented it in PyTorch, following HawkAaron's implementation, but it is too slow to be usable. Could you describe your implementation and the speed you get?

I would really appreciate an answer.

ZhengkunTian commented 5 years ago

I'm sorry for the late reply. A batched beam search is hard to implement. My current implementation either uses greedy search or decodes sentences one by one with beam search. In my experiments, beam search brought only a small improvement, so I have not written an efficient beam search implementation. Thanks for your interest.
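For readers following the thread, the greedy search mentioned above can be sketched as follows. This is a minimal illustration, not the code from this repository: `joint_log_probs` is a hypothetical callable standing in for the encoder, prediction network and joint network, blank is assumed to be index 0, and `toy_joint` is an invented stand-in for a trained model.

```python
import math
from typing import Callable, List, Sequence

BLANK = 0  # assumption: index 0 is the blank symbol


def greedy_decode(
    num_frames: int,
    joint_log_probs: Callable[[int, Sequence[int]], List[float]],
    max_symbols_per_frame: int = 5,
) -> List[int]:
    """Greedy transducer decoding for one utterance.

    joint_log_probs(t, prefix) is a hypothetical interface that returns
    log-probabilities over the vocabulary for frame t given the labels
    emitted so far.
    """
    hyp: List[int] = []
    for t in range(num_frames):
        # Emit labels at this frame until blank wins or the cap is reached.
        for _ in range(max_symbols_per_frame):
            log_probs = joint_log_probs(t, hyp)
            k = max(range(len(log_probs)), key=log_probs.__getitem__)
            if k == BLANK:
                break  # blank: move on to the next frame
            hyp.append(k)
    return hyp


def toy_joint(t: int, prefix: Sequence[int]) -> List[float]:
    """Invented toy model: prefers label t + 1 once per frame, then blank."""
    want = t + 1 if len(prefix) <= t else BLANK
    probs = [0.05] * 4
    probs[want] = 0.85
    return [math.log(p) for p in probs]


print(greedy_decode(3, toy_joint))
```

The cap `max_symbols_per_frame` is a safeguard against a model that never predicts blank; its value here is arbitrary.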

Marcovaldong commented 5 years ago

@ZhengkunTian Thanks for your reply, and congratulations on your papers being accepted at Interspeech 2019. My question is not about how to implement beam search on a mini-batch, but about how to implement it for a single sample, and what speed and accuracy it achieves.
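To make the single-sample case concrete, here is a simplified breadth-first beam search sketch. It is neither the author's code nor HawkAaron's, and it is not the original algorithm from Graves' paper: it limits the number of labels per frame and prunes after every expansion step. `joint_log_probs`, `toy_joint`, the blank index 0 and all default values are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

BLANK = 0  # assumption: index 0 is the blank symbol
Hyps = Dict[Tuple[int, ...], float]


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def top_k(hyps: Hyps, k: int) -> Hyps:
    """Keep the k highest-scoring hypotheses."""
    return dict(sorted(hyps.items(), key=lambda kv: kv[1], reverse=True)[:k])


def beam_search(
    num_frames: int,
    joint_log_probs: Callable[[int, Sequence[int]], List[float]],
    beam_size: int = 4,
    max_symbols_per_frame: int = 3,
) -> Tuple[List[int], float]:
    beam: Hyps = {(): 0.0}  # label prefix -> log-probability
    for t in range(num_frames):
        active = beam        # hypotheses that may still emit labels at frame t
        finished: Hyps = {}  # hypotheses that emitted blank and leave frame t
        for step in range(max_symbols_per_frame + 1):
            expanded: Hyps = {}
            for prefix, score in active.items():
                log_probs = joint_log_probs(t, prefix)
                # Blank ends this frame; merge alignments of the same prefix.
                finished[prefix] = log_add(
                    finished.get(prefix, -math.inf), score + log_probs[BLANK]
                )
                if step < max_symbols_per_frame:
                    for k, lp in enumerate(log_probs):
                        if k != BLANK:
                            expanded[prefix + (k,)] = score + lp
            active = top_k(expanded, beam_size)
        beam = top_k(finished, beam_size)
    best = max(beam, key=beam.get)
    return list(best), beam[best]


def toy_joint(t: int, prefix: Sequence[int]) -> List[float]:
    """Invented toy model: prefers label t + 1 once per frame, then blank."""
    want = t + 1 if len(prefix) <= t else BLANK
    probs = [0.05] * 4
    probs[want] = 0.85
    return [math.log(p) for p in probs]


hyp, score = beam_search(3, toy_joint)
print(hyp, score)
```

Each frame costs up to `beam_size * (max_symbols_per_frame + 1)` calls to the joint function, so in a real model caching the prediction-network state per prefix is the usual first step when decoding is too slow.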

On another note, I tried to implement shallow fusion in greedy decoding. Is that possible? My PyTorch implementation of shallow fusion turned out to be ineffective. However, when I added my language model, which was implemented in MXNet, via shallow fusion with greedy decoding, I got an improvement of 0.2%. So I would like to know how shallow fusion performed in your experiments.
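One way to combine shallow fusion with greedy decoding is to add a weighted language-model log-probability to each non-blank score before taking the argmax. The sketch below assumes that convention (blank receives no LM score) and is not taken from either implementation discussed here; `joint_log_probs`, `lm_log_probs`, `lm_weight` and the toy models are hypothetical.

```python
import math
from typing import Callable, List, Sequence

BLANK = 0  # assumption: index 0 is the blank symbol


def greedy_decode_with_fusion(
    num_frames: int,
    joint_log_probs: Callable[[int, Sequence[int]], List[float]],
    lm_log_probs: Callable[[Sequence[int]], List[float]],
    lm_weight: float = 0.3,
    max_symbols_per_frame: int = 5,
) -> List[int]:
    """Greedy transducer decoding with shallow fusion (illustrative sketch).

    lm_log_probs(prefix) is a hypothetical interface returning next-label
    log-probabilities with the same indexing as the transducer vocabulary;
    its entry at BLANK is ignored.
    """
    hyp: List[int] = []
    for t in range(num_frames):
        for _ in range(max_symbols_per_frame):
            am = joint_log_probs(t, hyp)
            lm = lm_log_probs(hyp)
            # Assumed convention: the LM scores only non-blank labels.
            fused = [
                a if k == BLANK else a + lm_weight * lm[k]
                for k, a in enumerate(am)
            ]
            k = max(range(len(fused)), key=fused.__getitem__)
            if k == BLANK:
                break
            hyp.append(k)
    return hyp


def toy_joint(t: int, prefix: Sequence[int]) -> List[float]:
    """Invented acoustic scores: labels 1 and 2 are nearly tied."""
    if len(prefix) <= t:
        probs = [0.10, 0.40, 0.38, 0.12]
    else:
        probs = [0.90, 0.04, 0.03, 0.03]
    return [math.log(p) for p in probs]


def toy_lm(prefix: Sequence[int]) -> List[float]:
    """Invented LM that prefers label 2; the blank entry is a placeholder."""
    return [0.0, math.log(0.1), math.log(0.8), math.log(0.1)]


print(greedy_decode_with_fusion(2, toy_joint, toy_lm, lm_weight=0.0))
print(greedy_decode_with_fusion(2, toy_joint, toy_lm, lm_weight=0.5))
```

Because the LM term is added only to non-blank scores and log-probabilities are never positive, a larger `lm_weight` shifts the comparison toward blank. This sketch does not compensate for that, so the weight needs tuning on held-out data.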