loretoparisi opened this issue 3 years ago
cc @vineelpratap @xuqiantong
Hi,
To run inference, please follow the commands here: https://github.com/facebookresearch/wav2letter/tree/master/recipes/mls#decoding using the latest Docker image from the flashlight repo.
We only provide pre-trained models for offline ASR, not for streaming ASR (https://github.com/facebookresearch/wav2letter/issues/920), and hence simple_streaming_asr_example cannot be used.
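For context, entering the flashlight container that those decode commands are run from might look roughly like the sketch below; the image tag (flml/flashlight:cuda-latest) and the host model path are assumptions, so check Docker Hub / the flashlight repo for the current image name.

```sh
# Pull the flashlight image and open a shell inside it, mounting the MLS model
# files downloaded on the host (image tag and paths are assumptions).
docker pull flml/flashlight:cuda-latest
docker run --rm -it --ipc=host \
  -v ~/mls_models:/root/mls_models \
  flml/flashlight:cuda-latest /bin/bash
# Inside the container, run the decode commands from the MLS recipe, e.g.:
#   /flashlight/build/bin/asr/fl_asr_decode --flagsfile=/root/mls_models/decode/[lang].cfg
```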
@vineelpratap thanks! So after I enter the docker container I run the commands for decoding, but I see two different syntaxes here. For beam search we have:
/flashlight/build/bin/asr/fl_asr_decode --flagsfile=decode/[lang].cfg
While for Viterbi decoding we have:
fl_asr_test --am=[...]/am.bin --lexicon=[...]/train_lexicon.txt --datadir=[...] --test=test.lst --tokens=[...]/tokens.txt --emission_dir='' --nouselexicon --show
Why?
That's true. fl_asr_test is for Viterbi decoding, while fl_asr_decode is for beam-search decoding with a language model. If you just care about getting the best WER, please use the latter.
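For reference, the flagsfile passed to fl_asr_decode bundles the same model paths as the fl_asr_test call plus the language model and beam-search parameters. Below is only a sketch with placeholder paths and values; the decode/[lang].cfg files shipped with the MLS recipe are the authoritative reference.

```sh
# Sketch of what a decode flagsfile might contain (placeholder paths and values,
# not the recipe's own settings).
cat > decode/[lang].cfg <<'EOF'
--am=[...]/am.bin
--tokens=[...]/tokens.txt
--lexicon=[...]/train_lexicon.txt
--lm=[...]/lm.bin
--datadir=[...]
--test=test.lst
--lmtype=kenlm
--decodertype=wrd
--uselexicon=true
--beamsize=500
--lmweight=2.0
--wordscore=0.5
--nthread_decoder=8
EOF
/flashlight/build/bin/asr/fl_asr_decode --flagsfile=decode/[lang].cfg
```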
Hello,
I don't know if my following question is the kind of question that is proper to ask on GitHub. Still, since I have been struggling with this for the last few days, I decided to ask. I am trying to learn how to train a speech recognition system in Spanish using Python, and I found out about wav2letter at the following link https://ai.facebook.com/blog/a-new-open-data-set-for-multilingual-speech-research/, which led me here: https://github.com/facebookresearch/wav2letter/tree/master/recipes/mls . I downloaded the proper files and tried to follow the USAGE STEPS in wav2letter/recipes/mls/README.md
Hi, yes, once you build flashlight (https://github.com/facebookresearch/flashlight#building-and-installing), it'll build the binaries for decoding. You can then use the commands mentioned in the MLS recipe to run decoding...
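For anyone building from source rather than using Docker, the build looks roughly like the sketch below; CMake option names can vary between flashlight versions, so the flashlight README remains the reference for the exact flags and dependencies (ArrayFire, KenLM, etc.).

```sh
# Sketch of a from-source build (option names may vary by flashlight version).
git clone https://github.com/flashlight/flashlight.git
cd flashlight && mkdir -p build && cd build
cmake .. -DFL_BACKEND=CUDA -DFL_BUILD_APP_ASR=ON
make -j"$(nproc)"
# The decoding binaries land in build/bin/asr/ (fl_asr_test, fl_asr_decode, ...)
```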
@vineelpratap is it possible to build using the provided Dockerfile here, and then use the MLS recipe to run the decoder in the same way?
Yes, that is also possible~
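A rough sketch of that route, with the caveat that the Dockerfile location, image tag, and mount paths below are assumptions rather than the repo's documented workflow:

```sh
# Build an image from the flashlight repo's Dockerfile, then run the MLS decode
# inside it (Dockerfile path, tag and mounts are assumptions -- adjust as needed).
git clone https://github.com/flashlight/flashlight.git && cd flashlight
docker build -f .docker/Dockerfile-CUDA -t flashlight:cuda .
docker run --rm -it --ipc=host -v ~/mls_models:/root/mls_models flashlight:cuda \
  /flashlight/build/bin/asr/fl_asr_decode --flagsfile=/root/mls_models/decode/[lang].cfg
```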
Question
To provide examples of inference with the MLS pretrained tokens & lexicon, acoustic model, and language model.
Additional Context
Currently, an example command to run wav2letter inference with the latest docker image is the following:
sudo docker run --rm -v ~:/root/host/ -it --ipc=host --name w2l -a stdin -a stdout -a stderr wav2letter/wav2letter:inference-latest sh -c "cat /root/host/audio/LibriSpeech/dev-clean/777/126732/777-126732-0070.flac.wav | /root/wav2letter/build/inference/inference/examples/simple_streaming_asr_example --input_files_base_path /root/host/model"
I have recently built a simpler docker image to run wav2vec inference here. It would be cool to have a simple pipeline for MLS/wav2letter as well!
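For comparison, an offline MLS counterpart of that one-liner could look something like the sketch below; the image name, mount points, and cfg path are assumptions, not a tested command.

```sh
# Hypothetical one-shot offline decode with the MLS models (untested sketch;
# image tag, paths and cfg name are assumptions).
sudo docker run --rm --ipc=host \
  -v ~/host/mls_model:/root/mls_model \
  flml/flashlight:cuda-latest \
  /flashlight/build/bin/asr/fl_asr_decode --flagsfile=/root/mls_model/decode/[lang].cfg
```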