flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

How many iterations do I need to train to reproduce a WER similar to the one the paper reports on LibriSpeech with the conv-GLU network? #640

Closed by KellyZhao960510 4 years ago

lunixbochs commented 4 years ago

I approximately reproduced the LibriSpeech TER from the original paper in around 40 epochs. It's the 1.6GB epoch 40 model here: https://talonvoice.com/research/
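Since the question asks about iterations rather than epochs, the conversion is simple arithmetic. Below is a minimal sketch, assuming one iteration means one optimizer update over one batch per GPU and that a final partial batch still counts as an iteration. The function name, utterance count, batch size, and GPU count are all hypothetical placeholders, not values from this thread or from the paper, so substitute your own training setup.

```python
import math


def epochs_to_iterations(epochs: int, num_utterances: int,
                         batch_size: int, num_gpus: int = 1) -> int:
    """Convert a number of epochs into a number of update iterations.

    Assumes each iteration consumes `batch_size` utterances on each of
    `num_gpus` devices, and that a trailing partial batch still counts.
    """
    effective_batch = batch_size * num_gpus
    iters_per_epoch = math.ceil(num_utterances / effective_batch)
    return epochs * iters_per_epoch


if __name__ == "__main__":
    # Hypothetical numbers for illustration only; plug in your own
    # training-set size and batch configuration.
    total = epochs_to_iterations(epochs=40, num_utterances=280_000,
                                 batch_size=4, num_gpus=8)
    print(f"approximate iterations for 40 epochs: {total}")
```

A larger effective batch means fewer iterations per epoch, so an epoch count like the one above is usually the more portable number to compare between setups.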

If you're training new English models these days, you should definitely try streaming convnets. It's a much better architecture: faster, smaller, more accurate, and better suited to online streaming. Some of my results with streaming convnets on English are here: https://github.com/facebookresearch/wav2letter/issues/577