flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

using language model in pre-trained model #44

Closed: saisrinivas047 closed this issue 6 years ago

saisrinivas047 commented 6 years ago

in the command "luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout.bin -progress -show -test dev-clean -save -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai"

will the KenLM language model be included automatically, or do we need to pass an extra parameter to the above command?

vineelpratap commented 6 years ago

No, test.lua only stores the emissions from the model. To use a language model, you need to run the decode.lua command described in the Running the Decoder (Inference) section: https://github.com/facebookresearch/wav2letter#running-the-decoder-inference
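
As a rough sketch, the two-step flow looks like this, reusing the exact commands quoted in this thread (the paths, model file, and decoder parameters are the original poster's, so substitute your own):

```
# Step 1: run the pre-trained acoustic model and save its emissions (-save); no LM is used here
luajit ~/wav2letter/test.lua ~/librispeech-glu-highdropout.bin -progress -show -test dev-clean -save \
  -datadir ~/librispeech-proc/ -dictdir ~/librispeech-proc/ -gfsai

# Step 2: decode the saved emissions with the KenLM model; the first argument is the folder
# holding the saved output-dev-clean.* / transitions-dev-clean.* files (see further down this thread)
luajit ~/wav2letter/decode.lua ~/experiments/hello_librispeech dev-clean -show \
  -letters ~/librispeech-proc/letters-rep.lst -words ~/dict.lst \
  -lm ~/3-gram.pruned.3e-7.arpa -lmweight 3.1639 \
  -beamsize 25000 -beamscore 40 -nthread 10 -smearing max
```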

saisrinivas047 commented 6 years ago

@vineelpratap the decode command "luajit ~/wav2letter/decode.lua ~/experiments/hello_librispeech dev-clean -show -letters ~/librispeech-proc/letters-rep.lst -words ~/dict.lst -lm ~/3-gram.pruned.3e-7.arpa -lmweight 3.1639 -beamsize 25000 -beamscore 40 -nthread 10 -smearing max -show" refers to the ~/experiments/hello_librispeech folder. Is that folder necessary? I don't have it, so what should I replace it with?

ambigus9 commented 6 years ago

@saisrinivas047 As I understand it, the folder name itself doesn't matter and can be anything. What matters is that the folder you pass to decode.lua contains output-dev-clean.bin, output-dev-clean.idx, and transitions-dev-clean.bin (see the quick check sketched below).
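
For example, assuming the folder name used earlier in this thread, you can verify the files are in place before decoding:

```
# the folder passed to decode.lua should hold the emission/transition files for the dev-clean split
ls ~/experiments/hello_librispeech
# expected: output-dev-clean.bin  output-dev-clean.idx  transitions-dev-clean.bin
```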

Hope that helps.