Open Youyoun opened 5 years ago
@Youyoun Were you able to improve your WER?
Hey,
It's been a while since I worked on speech to text. From memory, the best WER I ever reached was around 7%, but I don't remember the exact model parameters. I based it mainly on the SpecAugment paper.
What really helped bring the WER below 10% was using bigger batches. To make that work I used gradient accumulation (I think my batch size was 32 on 1 GPU, and with gradient accumulation I took it to an effective 512). Pretty easy to implement.
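For anyone unfamiliar with gradient accumulation, here is a rough sketch of the idea using the numbers above (32 per step, 512 effective, so 16 accumulation steps). It is plain Python on a toy one-parameter least-squares model, not the training code from this thread; the names `grad_mse`, `accumulated_grad`, and the synthetic data are made up for illustration. It only shows that averaging the gradients of 16 micro-batches of 32 gives the same gradient as one batch of 512.

```python
# Illustrative sketch only: a toy 1-parameter least-squares model in plain
# Python. The helper names and the synthetic data are assumptions.
import random

MICRO_BATCH = 32                                # per-step batch size quoted above
EFFECTIVE_BATCH = 512                           # target batch size quoted above
ACCUM_STEPS = EFFECTIVE_BATCH // MICRO_BATCH    # 16 micro-batches per update


def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) over one batch, with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data):
    """Average the micro-batch gradients, as gradient accumulation does."""
    total = 0.0
    for i in range(0, len(data), MICRO_BATCH):
        micro = data[i:i + MICRO_BATCH]
        # Dividing by ACCUM_STEPS keeps the result a mean rather than a sum.
        total += grad_mse(w, micro) / ACCUM_STEPS
    return total


random.seed(0)
data = [(x, 3.0 * x + random.gauss(0.0, 0.1))
        for x in (random.uniform(-1.0, 1.0) for _ in range(EFFECTIVE_BATCH))]

w = 0.0
g_accum = accumulated_grad(w, data)   # 16 micro-batches of 32
g_full = grad_mse(w, data)            # one batch of 512
print(ACCUM_STEPS, abs(g_accum - g_full) < 1e-9)
```

In a real training loop the same pattern usually means: scale each micro-batch loss by 1/16, run the backward pass on every micro-batch so the gradients add up, and only call the optimizer step (and reset the gradients) every 16th micro-batch. Note that layers depending on batch statistics, such as batch norm, still see batches of 32, so the match with a true batch of 512 is not exact for those models.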
It took me 2 weeks to train on a single GPU.
Hope this helps.
Hi,
Does anyone have a config file that works wonders for training the ASR model on LibriSpeech 960h? I can't seem to get to the ~4% WER promised by many research papers. My best so far is above 10%. Surely, with the tools provided by this repository, there must be a way to reach that WER.