syhw / wer_are_we

Attempt at tracking states of the arts and recent results (bibliography) on speech recognition.

TIMIT update #20

Open vanekyj opened 6 years ago

vanekyj commented 6 years ago

Hi, we have recently made some improvements on TIMIT:

Average PER 15.58% (minimum 15.08%) on the core test set, using fMLLR features and a 4x1024 LSTM: http://arxiv.org/abs/1806.07974. It will be presented at TSD 2018 next week.

Further, we improved the result with an NN ensemble and a regularization post-layer in our SPECOM 2018 paper (18-22 September): average PER 14.84% (minimum 14.69%), https://arxiv.org/abs/1806.07186
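For readers comparing the numbers, here is a minimal sketch of how an average and minimum PER over several training runs could be computed. It assumes the usual definition of PER (substitutions, deletions and insertions from an edit-distance alignment, divided by the number of reference phones); the function names and the toy phone sequences are hypothetical and are not taken from the papers or from the linked scripts.

```python
from statistics import mean


def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """PER in percent: total edit errors over total reference phones."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


def summarize_runs(per_values):
    """Return (average PER, minimum PER) over independent runs."""
    return mean(per_values), min(per_values)


# Toy data only (hypothetical, not TIMIT): one substituted phone out of five.
toy_refs = [["sil", "k", "ae", "t", "sil"]]
toy_hyps = [["sil", "k", "ah", "t", "sil"]]
toy_per = phone_error_rate(toy_refs, toy_hyps)
```

Calling `summarize_runs` on the list of per-run PER values would give an "average (min.)" pair of the kind reported above. Details such as which phone set is scored and how silence is handled are outside this sketch.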

In addition, we share ready-to-try Python scripts here: https://github.com/OrcusCZ/NNAcousticModeling

To be fair, we also found a strong result by Ravanelli: average PER 14.9% with fMLLR features and an M-reluGRU-based NN, https://arxiv.org/abs/1710.00641

Thanks, Jan