NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

vad_infer.py script fails with `IndexError` if option `--dont_auto_split` is not set #2545

Closed: PeganovAnton closed this issue 3 years ago

PeganovAnton commented 3 years ago

Describe the bug

The `NeMo/examples/asr/vad_infer.py` script fails with `IndexError` if the `--dont_auto_split` option is not set and at least one .wav file is long enough for `nemo.collections.asr.parts.utils.vad_utils.prepare_manifest()` to split it.
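
To illustrate the kind of mismatch that can produce such an error, here is a minimal sketch. It is not NeMo's actual code: `split_entry`, the 50-second limit, and the sample entries are hypothetical, and the field names (`audio_filepath`, `offset`, `duration`) are assumed for illustration. It only shows that splitting long files yields more manifest entries than there are original files, so any loop that walks the split entries while indexing the original list runs out of bounds.

```python
def split_entry(entry, max_len):
    """Split one manifest entry into chunks of at most max_len seconds (hypothetical helper)."""
    chunks = []
    offset = 0.0
    remaining = entry["duration"]
    while remaining > 0:
        dur = min(max_len, remaining)
        chunks.append(
            {"audio_filepath": entry["audio_filepath"], "offset": offset, "duration": dur}
        )
        offset += dur
        remaining -= dur
    return chunks


# Hypothetical input manifest: one long file and one short file.
original = [
    {"audio_filepath": "a.wav", "duration": 130.0},
    {"audio_filepath": "b.wav", "duration": 20.0},
]

# After splitting, a.wav contributes three chunks and b.wav one chunk.
split = [chunk for entry in original for chunk in split_entry(entry, max_len=50.0)]

# Indexing the original list with positions taken from the split list
# goes out of range as soon as any file has been split.
mismatch_detected = False
try:
    for i in range(len(split)):
        _ = original[i]["audio_filepath"]
except IndexError:
    mismatch_detected = True

print(len(original), len(split), mismatch_detected)
```

With `--dont_auto_split` set, no splitting would happen in this sketch, the two lists would have equal length, and the loop would complete without error.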

Steps/Code to reproduce bug

mkdir -p ~/debug_data
wget http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/IWSLT-SLT.tst2019.en-de.tgz -O ~/debug_data/IWSLT-SLT.tst2019.en-de.tgz
tar xzf ~/debug_data/IWSLT-SLT.tst2019.en-de.tgz -C ~/debug_data/
cd ~/NeMo/examples/asr/
wget https://raw.githubusercontent.com/NVIDIA/NeMo/feat/asr/iwslt_audio_to_nemo_format/scripts/dataset_processing/prepare_iwslt_audio_data.py
python prepare_iwslt_audio_data.py -a ~/debug_data/IWSLT.tst2019/wavs/ -t ~/debug_data/IWSLT.tst2019/IWSLT.TED.tst2019.en-de.en.xml -o ~/debug_data/IWSLT.tst2019/manifest.json
python vad_infer.py --dataset ~/debug_data/IWSLT.tst2019/manifest.json --out_dir ~/debug_data/IWSLT.tst2019/vad --vad_model vad_marblenet

Expected behavior

No errors

fayejf commented 3 years ago

Thank you so much, @PeganovAnton! https://github.com/NVIDIA/NeMo/pull/2546 is approved.