benob closed this issue 6 years ago
Hi,
dict.lst is produced by the ~/wav2letter/data/utils/convert-arpa.lua script (the README describes how to run it).
The WER reported in the paper is based on a 4-gram language model. The following parameters should produce 4.3% WER on librispeech dev-clean:
-lmweight 3.1639 -silweight -0.37491 -beamsize 25000 -beamscore 40
The dict.lst file required for rescoring the librispeech output with a language model seems to be missing from the repository.
I tried to recreate one with the following command:
zcat 3-gram.pruned.3e-7.arpa.gz | perl -ne 'chomp;$_=lc;@a=split /\t/;if(/^\\1-grams:/.../^$/){$w=$a[1]; $w=~s/(.)(\1+)/$1.length($2)/e; print "$a[1] $w\n"}' | grep -v "<\|^ *$\|[3-9]" > dict.lst
However, I get a WER of 6.73% on dev-clean after rescoring. I expected something in the 4-5% range, as reported in the paper.
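
For readers who find the one-liner above hard to follow, here is a rough Python sketch of the same idea: read the unigram section of the ARPA file, lowercase each word, and write "word spelling" pairs in which a run of repeated letters is replaced by the letter plus the number of extra repeats. The function names (encode_repeats, unigrams, dict_entries) are made up for illustration, and this is not a port of convert-arpa.lua. One difference is deliberate: the Perl substitution has no /g flag, so it rewrites only the first repeated run in a word, while this sketch rewrites every run. Whether the official script does the same is an assumption here, not something confirmed in this thread.

```python
import re


def encode_repeats(word):
    """Replace each run of a repeated character with the character
    followed by the number of extra repeats, e.g. "hello" -> "hel1o"."""
    return re.sub(r"(.)(\1+)",
                  lambda m: m.group(1) + str(len(m.group(2))),
                  word)


def unigrams(arpa_lines):
    """Yield lowercased words from the \\1-grams: section of an ARPA file."""
    in_unigrams = False
    for line in arpa_lines:
        line = line.rstrip("\n")
        if line.startswith("\\1-grams:"):
            in_unigrams = True
            continue
        if in_unigrams:
            if not line.strip():
                break  # a blank line ends the unigram section
            fields = line.lower().split("\t")
            if len(fields) >= 2:
                yield fields[1]  # fields: log-prob, word, optional backoff


def dict_entries(arpa_lines):
    """Yield "word spelling" lines, mirroring the grep filter in the
    one-liner: skip tokens such as <s>, and skip any entry containing
    a digit from 3 to 9."""
    for word in unigrams(arpa_lines):
        if "<" in word:
            continue
        entry = f"{word} {encode_repeats(word)}"
        if re.search(r"[3-9]", entry):
            continue
        yield entry


if __name__ == "__main__":
    # Tiny made-up ARPA fragment, only to exercise the functions.
    sample = [
        "\\data\\\n", "ngram 1=3\n", "\n",
        "\\1-grams:\n",
        "-1.0\t<s>\t-0.5\n",
        "-2.0\tHELLO\t-0.3\n",
        "-2.5\tCOMMITTEE\n",
        "\n",
        "\\2-grams:\n",
    ]
    for entry in dict_entries(sample):
        print(entry)
```

To use it on the real model, the lines could be read with gzip.open("3-gram.pruned.3e-7.arpa.gz", "rt") and the entries written to dict.lst. Note that this only changes how the dictionary is built; it does not address the difference between the pruned 3-gram model used here and the 4-gram model behind the paper's numbers.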