NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

Flashlight decoder for ASR #9103

Closed Mark-Serin closed 5 months ago

Mark-Serin commented 6 months ago

On this page you describe how to use the flashlight decoder by launching your Python script, but I can't find the parameters it should take. Later NeMo versions have an installation guide but no Python script for using it.

Can you please help me with that? Is this feature (flashlight decoder) still in progress, or is it deprecated?

titu1994 commented 6 months ago

Yes it's supported, please find installation instructions here - https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/asr/asr_language_modeling.html#n-gram-language-modeling

To install flashlight, users have to run that bash script as is; it is not run through python.

Mark-Serin commented 6 months ago

I followed this guide, but it refers to scripts that don't exist anymore. For example, the following command is no longer relevant:

python eval_beamsearch_ngram.py nemo_model_file=<path to the .nemo file of the model> \
       input_manifest=<path to the evaluation JSON manifest file> \
       kenlm_model_file=<path to the binary KenLM model> \
       beam_width=[<list of the beam widths, separated with commas>] \
       beam_alpha=[<list of the beam alphas, separated with commas>] \
       beam_beta=[<list of the beam betas, separated with commas>] \
       preds_output_folder=<optional folder to store the predictions> \
       probs_cache_file=null \
       decoding_mode=beamsearch_ngram \
       decoding_strategy="<Beam library such as beam, pyctcdecode or flashlight>"

because there is no eval_beamsearch_ngram.py here.
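
For anyone who does have a version of the script that accepts these options: the arguments above are plain key=value overrides, with lists written in square brackets and null for an empty value. A minimal sketch of assembling such a command in Python follows. The helper names (format_value, build_command) and all example values are hypothetical, and whether a given NeMo version ships the script or accepts decoding_strategy is exactly what is in question in this thread.

```python
import shlex


def format_value(value):
    """Render a Python value in the key=value style used by the command above."""
    if value is None:
        return "null"
    if isinstance(value, (list, tuple)):
        # Lists are written as [a,b,c] with no spaces.
        return "[" + ",".join(str(v) for v in value) + "]"
    return str(value)


def build_command(script, overrides):
    """Return an argv list: python <script> key=value ..."""
    return ["python", script] + [
        f"{key}={format_value(value)}" for key, value in overrides.items()
    ]


# Hypothetical example values; replace with your own paths and settings.
overrides = {
    "nemo_model_file": "model.nemo",
    "input_manifest": "eval_manifest.json",
    "kenlm_model_file": "lm.bin",
    "beam_width": [64, 128],
    "beam_alpha": [1.0],
    "beam_beta": [0.5],
    "probs_cache_file": None,
    "decoding_mode": "beamsearch_ngram",
    "decoding_strategy": "flashlight",
}

cmd = build_command("eval_beamsearch_ngram.py", overrides)
# shlex.join quotes the bracketed list arguments so the shell leaves them alone.
print(shlex.join(cmd))
```

The argv list can be passed to subprocess.run directly, which avoids shell quoting issues with the bracketed values.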

titu1994 commented 6 months ago

@karpnv could you update the docs?

Mark-Serin commented 6 months ago

Previous versions of NeMo have the script, but it has no decoding_strategy parameter, so it can't be used with flashlight. The parameter is absent even in older versions of NeMo, for example 1.18.0.
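
One way to check this across versions is to search a local checkout for the script and see whether its source mentions the parameter. A minimal sketch, assuming a local clone of the repository; the function name find_scripts_mentioning is hypothetical, and a plain text match only shows that the name appears somewhere in the file, not that the option actually works.

```python
from pathlib import Path


def find_scripts_mentioning(root, script_name, param):
    """Return (path, has_param) for every file named script_name under root."""
    results = []
    for path in sorted(Path(root).rglob(script_name)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Simple substring check; it does not parse the script's config.
        results.append((path, param in text))
    return results


# Example usage against a local checkout (the path is a placeholder):
# for path, has_param in find_scripts_mentioning(
#     "path/to/NeMo", "eval_beamsearch_ngram.py", "decoding_strategy"
# ):
#     print(path, "mentions" if has_param else "does not mention", "decoding_strategy")
```

Checking out a different tag and rerunning the search gives a quick per-version comparison.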

github-actions[bot] commented 5 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.