facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

[ASR] Is lexicon-free W2lKenLMDecoder beam search decoding possible with no external language model? #5402

Open abarcovschi opened 9 months ago


Hello, I am wondering whether it is possible to run lexicon-free beam search decoding using W2lKenLMDecoder without an external language model.

I am trying to initialise my decoder with the following code:

        # At module level, this assumes the following imports:
        # from argparse import Namespace
        # from examples.speech_recognition.w2l_decoder import W2lKenLMDecoder

        beam_size = 500
        beam_threshold = 25.0

        decoder_args = {
            'kenlm_model': '',   # no external LM binary available
            'beam': beam_size,
            'beam_threshold': beam_threshold,
            'sil_weight': 0.0,
            'lexicon': '',       # empty string -> lexicon-free decoding
            'unit_lm': True,
        }
        decoder_args = Namespace(**decoder_args)
        self.decoder = W2lKenLMDecoder(decoder_args, target_dict)

I am running into the error "No such file or directory while opening" on this line in fairseq/examples/speech_recognition/w2l_decoder.py: self.lm = KenLM(args.kenlm_model, self.word_dict). The KenLM class does not accept an empty string as input, so it seems I am required to provide a path to a language model binary file.

Taking inspiration from https://github.com/flashlight/wav2letter/wiki/Beam-Search-Decoder, is there an option to use a "Zero LM" proxy object in lieu of an actual external language model?
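
To make the question concrete, here is roughly what I have in mind, sketched directly against the flashlight-text Python bindings rather than through fairseq. I am assuming the bindings expose a ZeroLM class alongside LexiconFreeDecoder, and token_dict, sil_idx, and blank_idx are placeholders for the target dictionary and its silence/blank indices:

    from flashlight.lib.text.decoder import (
        CriterionType,
        LexiconFreeDecoder,
        LexiconFreeDecoderOptions,
        ZeroLM,  # assumption: a no-op LM, if the bindings expose it
    )

    options = LexiconFreeDecoderOptions(
        beam_size=500,
        beam_size_token=len(token_dict),
        beam_threshold=25.0,
        lm_weight=0.0,  # zero weight, since the LM contributes nothing
        sil_score=0.0,
        log_add=False,
        criterion_type=CriterionType.CTC,
    )
    decoder = LexiconFreeDecoder(options, ZeroLM(), sil_idx, blank_idx, [])

If ZeroLM is not available, would subclassing the bindings' LM base class with no-op start/score/finish methods (the way FairseqLM does in w2l_decoder.py) be the intended route?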

Also, there is the following line in fairseq/examples/speech_recognition/w2l_decoder.py: assert args.unit_lm, "lexicon free decoding can only be done with a unit language model". What does a "unit" language model mean in this context? Is it different from a "zero/proxy" LM?

My conclusion at the moment is that beam search decoding with W2lKenLMDecoder is only possible with an external language model, though it can be run either with or without a lexicon file. However, I am wondering whether it's also possible to run decoding without an external language model.
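
For completeness, the only LM-free path I have found so far is greedy Viterbi decoding rather than beam search. A minimal sketch, assuming the arguments read by the W2lDecoder base class (nbest and criterion) are sufficient:

    from argparse import Namespace

    from examples.speech_recognition.w2l_decoder import W2lViterbiDecoder

    # No lexicon and no LM: plain best-path decoding over the emissions.
    decoder_args = Namespace(nbest=1, criterion='ctc')
    decoder = W2lViterbiDecoder(decoder_args, target_dict)
    # hypos = decoder.decode(emissions)

This works, but it gives up the beam search entirely, which is why I am asking about a zero-LM beam decoder.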

If anyone could provide some feedback on this issue, I would be very grateful!