HawkAaron / E2E-ASR

PyTorch Implementations for End-to-End Automatic Speech Recognition

Questions about results #3

Closed ZhengkunTian closed 5 years ago

ZhengkunTian commented 5 years ago

Hello Mingkun, first of all, thank you for contributing the code. I would like to know whether your CTC model and RNN transducer have reached the results in Alex Graves' paper. My own CTC model, without any LM, achieved a PER of 21 on TIMIT, which is far from Graves' result. I also ran your code with the default parameters and got a PER of 22, so I am confused about the gap. It would be great if you could give me some advice. Best regards, Zhengkun Tian

HawkAaron commented 5 years ago

I ran the full experiments using MXNet Gluon: https://github.com/HawkAaron/RNN-Transducer/tree/graves2013

I found that the Gluon results are always better than the PyTorch ones. Maybe you can check the phone set? BTW, beam search for CTC should give slightly better results.
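To illustrate the beam search remark, here is a minimal pure-Python sketch comparing best-path (greedy) CTC decoding with CTC prefix beam search. It is not the code used in this repository: the function names, the `beam_size` default, and the convention that index 0 is the blank label are assumptions made for the example, and the input is assumed to be per-frame log-probabilities with no zero-probability entries.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_greedy_decode(log_probs, blank=0):
    """Best-path decoding: argmax per frame, collapse repeats, drop blanks."""
    out, prev = [], blank
    for frame in log_probs:
        k = max(range(len(frame)), key=frame.__getitem__)
        if k != blank and k != prev:
            out.append(k)
        prev = k
    return out


def ctc_prefix_beam_search(log_probs, beam_size=4, blank=0):
    """Prefix beam search: sums over all alignments of each label prefix."""
    # Each prefix keeps (log P(ends in blank), log P(ends in non-blank)).
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for k, lp in enumerate(frame):
                if k == blank:
                    # A blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                new_prefix = prefix + (k,)
                b, nb = next_beams[new_prefix]
                if prefix and prefix[-1] == k:
                    # A repeated label extends the prefix only after a blank...
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        # Keep only the highest-scoring prefixes.
        ranked = sorted(next_beams.items(),
                        key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best_prefix, _ = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix)


# Toy example: two frames, labels {0: blank, 1: "a"}, P(blank)=0.6, P(a)=0.4.
toy = [[math.log(0.6), math.log(0.4)]] * 2
greedy_result = ctc_greedy_decode(toy)
beam_result = ctc_prefix_beam_search(toy)
```

In this toy case the greedy path is blank-blank (probability 0.36), so greedy decoding returns an empty sequence, while the label sequence `[1]` has total probability 0.64 over its three alignments and is what the beam search returns. On real acoustic model outputs the difference is usually much smaller.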

ZhengkunTian commented 5 years ago

Thanks a lot!

PeiyanFlying commented 4 years ago

Hello friends,

I have run the whole project on the TIMIT dataset, but my results are far from the PER of 21. Also, when I run eval.py, it generates some phonemes, such as 'cl', that are not included in our phone.map, so I added them manually.

Is that right?

Or should I train the CTC model first, and then use the CTC model to train the RNN-T model?

Thanks and Best.
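On the phone set question raised in this thread: TIMIT results are conventionally scored after folding the 48 training phones down to 39, where closure symbols such as 'cl' are merged into 'sil'. The sketch below shows how such folding changes the PER. It is only an illustration under assumptions: the table is the commonly used 48-to-39 folding and has not been checked against this repository's phone.map, and `fold`, `edit_distance`, and `phone_error_rate` are hypothetical helper names, not functions from eval.py.

```python
# Assumed 48 -> 39 folding as commonly used for TIMIT scoring.
# Verify against the phone.map actually used before relying on it.
FOLD_48_TO_39 = {
    "cl": "sil", "vcl": "sil", "epi": "sil",
    "ao": "aa", "ax": "ah", "el": "l",
    "en": "n", "ix": "ih", "zh": "sh",
}


def fold(phones, table=FOLD_48_TO_39):
    """Map each phone to its folded class; unknown phones pass through."""
    return [table.get(p, p) for p in phones]


def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def phone_error_rate(refs, hyps, table=FOLD_48_TO_39):
    """PER in percent, computed after folding both references and hypotheses."""
    errors = sum(edit_distance(fold(r, table), fold(h, table))
                 for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


# Toy example: the hypothesis contains 'cl', which folds into 'sil'.
ref = ["sil", "k", "ae", "t", "sil"]
hyp = ["cl", "k", "ae", "d", "sil"]
per_folded = phone_error_rate([ref], [hyp])
per_unfolded = phone_error_rate([ref], [hyp], table={})
```

With folding, only the t/d substitution counts as an error (1 of 5 phones); without folding, 'cl' versus 'sil' is counted as a second error. A mismatch between the model's output phone set and the scoring map can therefore inflate the reported PER.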