I have a speech dataset. I want to use pretrained quartznet with language model

gopesh97 commented 4 years ago

Describe your question

I have a speech dataset. I want to use the pre-trained QuartzNet model with a language model, preferably Transformer-XL or KenLM, as suggested in the blog post.

import nemo.collections.asr as nemo_asr

# files: list of paths to the audio files to transcribe
quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")
transcripts = quartznet.transcribe(paths2audio_files=files)

I want to know what code changes I need to make to integrate a language model into prediction, and how to train a corresponding Transformer-XL model for use with QuartzNet.

Environment overview (please complete the following information)

Environment location: Bare-metal
Method of NeMo install: !python -m pip install git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[all]

Environment details

OS version: Ubuntu 18.04
Python version: 3.6

Additional context: NVIDIA RTX 2080

okuchaiev commented 4 years ago

Currently we aren't planning to port Transformer-XL LM rescoring. However, we do plan to work on adding this kind of Transformer LM: https://github.com/NVIDIA/NeMo/issues/1126

Closing as a duplicate feature request.

gopesh97 commented 4 years ago

Can't I use any language model?
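
For context, LM rescoring in general does not depend on one particular language model: any model that can assign a log-probability to a candidate transcript can be used to re-rank an n-best list. The following is a minimal, library-independent sketch of that idea, not NeMo's API. The toy bigram model, the function names `train_bigram_lm` and `rescore`, the weights `alpha` and `beta`, and the example hypotheses with their acoustic scores are all hypothetical and chosen only for illustration. How an n-best list is obtained from QuartzNet is not covered here.

```python
import math
from collections import Counter


def train_bigram_lm(corpus, smoothing=1.0):
    """Build an add-k smoothed bigram LM and return a sentence scorer.

    The returned function maps a sentence to its natural-log probability.
    This toy model stands in for a real LM such as an n-gram or Transformer LM.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def score(sentence):
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        logp = 0.0
        for prev, cur in zip(tokens[:-1], tokens[1:]):
            num = bigram_counts[(prev, cur)] + smoothing
            den = context_counts[prev] + smoothing * vocab_size
            logp += math.log(num / den)
        return logp

    return score


def rescore(nbest, lm_score, alpha=0.5, beta=0.0):
    """Re-rank (text, acoustic_logprob) pairs, best first.

    combined = acoustic_logprob + alpha * lm_logprob + beta * word_count
    """
    scored = [
        (text, am + alpha * lm_score(text) + beta * len(text.split()))
        for text, am in nbest
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical training text and n-best list (acoustic scores are made up).
corpus = [
    "recognize speech",
    "recognize speech with a language model",
    "a nice beach",
]
lm_score = train_bigram_lm(corpus)
nbest = [("wreck a nice beach", -4.0), ("recognize speech", -4.2)]

best_text, best_score = rescore(nbest, lm_score, alpha=1.0)[0]
print(best_text)
```

In this sketch, the acoustic scores alone slightly prefer the first hypothesis, and adding the weighted LM score changes the ranking. Swapping in a different language model only requires replacing `lm_score` with another function that returns a log-probability for a string. The weights would normally be tuned on held-out data.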

I have a speech dataset. I want to use pretrained quartznet with language model #1175