NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0
11.93k stars 2.48k forks

function in VAD utils hardcoded to only use `.wav` files #6754

Closed riqiang-dp closed 1 year ago

riqiang-dp commented 1 year ago

Describe the bug

This line splits file paths on `.wav` only when deriving file names. If the input files are not `.wav`, the code still runs but produces wrong file names, so the last stage of vad_infer.py, where the final manifest is created, fails to find the correct files. Specifically, in here, if the file was `.flac`, we look for utt.txt, whereas the file actually created by the previous steps is named utt.flac.txt.
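
The mismatch can be illustrated with a minimal sketch. The helper `name_from_wav_split` and the `/data/...` paths are hypothetical stand-ins for the hardcoded logic, not the actual NeMo code:

```python
import os


def name_from_wav_split(filepath: str) -> str:
    # Hypothetical stand-in for the hardcoded logic:
    # drop the directory, then cut the name at ".wav".
    return os.path.basename(filepath).split(".wav")[0]


# For a .wav input the extension is stripped as intended.
wav_name = name_from_wav_split("/data/utt.wav")

# For a .flac input ".wav" never matches, so the extension stays in the name.
flac_name = name_from_wav_split("/data/utt.flac")

# Earlier steps would then write "<name>.txt" using the unstripped name,
# while the final stage looks for the name without any audio extension.
written_file = flac_name + ".txt"
expected_file = "utt.txt"
```

With these inputs, `written_file` ends up as utt.flac.txt while the final stage expects utt.txt, which matches the failure described above.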

Steps/Code to reproduce bug

Expected behavior

It should handle different file formats.
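
One possible way to do this, sketched under the assumption that only the final extension needs to be dropped, is to derive the name with `pathlib` instead of splitting on a fixed string. The helper name `audio_stem` is hypothetical and not part of NeMo:

```python
from pathlib import Path


def audio_stem(filepath: str) -> str:
    # Drop the directory and the final extension, whatever that extension is.
    return Path(filepath).stem


stems = [audio_stem(p) for p in ("/data/utt.wav", "/data/utt.flac", "/data/utt.mp3")]
```

All three paths give the same name, utt, so later stages can look up utt.txt regardless of the audio format. Note that `Path.stem` removes only the last suffix, so a file such as utt.v2.flac would yield utt.v2.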

Environment overview (please complete the following information)

N/A

Environment details

N/A

Additional context

N/A

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 7 days since being marked as stale.