quhonglin opened this issue 2 years ago
You need to specify your lexicon file in the command, with --lexicon lexicon.txt in your case:
subset=test_clean
CUDA_VISIBLE_DEVICES=1 python /home/quhongling/fairseq-main/examples/speech_recognition/infer.py \
/Data/QuHonglin/datasets/wav2vec2/Librispeech/evaluate/100h \
--task audio_finetuning \
--nbest 1 --path /Data/QuHonglin/pre-trained-models/wav2vec_small_100h.pt \
--gen-subset $subset --results-path /Data/QuHonglin/experiments/wav2vec2/Librispeech/evaluate/100h/4-gram-lm/test_clean \
--w2l-decoder kenlm --lm-model /Data/QuHonglin/pre-trained-models/lm_librispeech_kenlm_word_4g_200kvocab.bin \
--lm-weight 2 --word-score -1 --sil-weight 0 --criterion ctc --labels ltr --max-tokens 4000000 \
--post-process letter --lexicon lexicon.txt
Use the same lexicon.txt that matches the KenLM model; it should look like this:
EVERY E V E R Y |
WORD W O R D |
THAT T H A T |
EXISTS E X I S T S |
IN I N |
YOUR Y O U R |
LABEL L A B E L |
OR O R |
TRANSCRIPTION T R A N S C R I P T I O N |
FILE F I L E |
WILL W I L L |
WRITE W R I T E |
DOWN D O W N |
LIKE L I K E |
THIS T H I S |
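A lexicon in this format can be generated directly from your transcripts. A minimal sketch (the function name and the word list are illustrative, not part of fairseq):

```python
def make_lexicon(words):
    """Map each unique word to its space-separated letters plus the '|' word boundary,
    matching the WORD -> W O R D | format shown above."""
    return {w: " ".join(w) + " |" for w in sorted(set(words))}

# Example: build entries from a few transcript words.
words = "EVERY WORD THAT EXISTS".split()
for word, spelling in make_lexicon(words).items():
    print(f"{word} {spelling}")  # e.g. "EVERY E V E R Y |"
```

Writing the output to lexicon.txt, one entry per line, gives a file in the format the decoder expects.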
@Abdullah955 Thanks for your reply. But if I want lexicon-free decoding, how should I do that?
@quhonglin I think you need a unit LM, i.e. a language model built with characters as its units.
@quhonglin You need to create your own language model, or use a pre-trained one, with KenLM. This tutorial should help you:
https://huggingface.co/blog/wav2vec2-with-ngram
You only need plain text to train the model.
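Before training with KenLM, the text should be normalized to match the model's letter vocabulary; for the lexicon-free (unit LM) route, the corpus is spelled out character by character instead. A rough sketch, assuming LibriSpeech-style uppercase transcripts (the function names are placeholders):

```python
import re

def normalize(line):
    # Uppercase and keep only letters, apostrophes, and spaces, so the
    # corpus uses only characters present in the model's dict.ltr.txt.
    line = re.sub(r"[^A-Z' ]+", " ", line.upper())
    return " ".join(line.split())

def to_units(line):
    # Character-level corpus for a unit LM: each word is spelled out,
    # followed by the '|' word-boundary token.
    return " ".join(" ".join(w) + " |" for w in line.split())

print(normalize("Hello, world!"))   # HELLO WORLD
print(to_units("HELLO WORLD"))      # H E L L O | W O R L D |

# The normalized corpus can then be passed to KenLM's tools, e.g.:
#   lmplz -o 4 < corpus.txt > lm.arpa
#   build_binary lm.arpa lm.bin
```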
Thanks, everyone. I no longer use this setup and have also forgotten some details of the issue. Maybe I'll try again when I have time in the future.
@Abdullah955 I followed the right steps, and my lexicon format is exactly as above, but when execution reaches "self.lm = KenLM(cfg.lmpath, self.word_dict)",
Segmentation fault (core dumped) occurs.
Can you help me with this? Thanks.
The model is the data2vec base model.
@quhonglin I have the same question: when I run the command, the hypotheses are all empty, resulting in a WER of 100%. Have you solved this problem? If so, could you help me? Thanks a lot!
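Empty hypotheses often come from a mismatch between the lexicon spellings and the model's token dictionary (dict.ltr.txt): if a spelling uses a symbol the model cannot emit, the decoder can prune every path. A hypothetical sanity check (the function name and sample data are illustrative):

```python
def find_bad_entries(lexicon_lines, dict_symbols):
    """Return lexicon words whose spelling uses a token absent from dict.ltr.txt."""
    bad = []
    for line in lexicon_lines:
        word, *spelling = line.split()
        if any(tok not in dict_symbols for tok in spelling):
            bad.append(word)
    return bad

# Toy dictionary and lexicon: "HE" is flagged because "E" is missing.
dict_symbols = {"H", "I", "|"}
lexicon = ["HI H I |", "HE H E |"]
print(find_bad_entries(lexicon, dict_symbols))  # ['HE']
```

In practice, load dict_symbols from the first column of dict.ltr.txt and lexicon_lines from the lexicon file passed to --lexicon.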
❓ Questions and Help
What is your question?
When I evaluate a CTC model with wav2vec 2.0 according to fairseq/examples/wav2vec/README.md, I encounter the following error.
Code
Here is the code I'm executing:
And here is the error log:
What have you tried?
When I try --w2l-decoder viterbi, it works fine. When I try to add --unit-lm, or --unit-lm --kenlm-model=/Data/QuHonglin/pre-trained-models/lm_librispeech_kenlm_word_4g_200kvocab.bin, it runs, but the hypotheses are all empty, resulting in a WER of 100%. So how do I correctly use a language model to decode the wav2vec 2.0 CTC model?
What's your environment?