NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

`speech_to_text_aed_chunked_infer.py` throws `TypeError: 'TranscriptionConfig' object is not iterable` #10246

Open kunibald413 opened 1 month ago

kunibald413 commented 1 month ago

Describe the bug

Running chunked inference with a Canary model via `speech_to_text_aed_chunked_infer.py` throws this error: `TypeError: 'TranscriptionConfig' object is not iterable`
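
For readers unfamiliar with this class of error, here is a minimal sketch of how it can arise in plain Python. It assumes the config reaches the loop as a plain dataclass instance rather than a dict-like object; the `TranscriptionConfig` below is a hypothetical stand-in with made-up fields, not NeMo's actual class, and `reproduce_error` and `config_items` are illustrative helper names.

```python
from dataclasses import dataclass, fields


@dataclass
class TranscriptionConfig:
    # Hypothetical stand-in with illustrative fields; NOT NeMo's actual config class.
    model_path: str = "None"
    batch_size: int = 8


def reproduce_error(cfg):
    """Try to iterate over the config directly and return the error message, if any."""
    try:
        for _ in cfg:  # a plain dataclass instance defines no __iter__
            pass
    except TypeError as exc:
        return str(exc)
    return None


def config_items(cfg):
    """Iterate over the dataclass fields explicitly instead of over the object."""
    return {f.name: getattr(cfg, f.name) for f in fields(cfg)}


cfg = TranscriptionConfig()
message = reproduce_error(cfg)
print(message)
print(config_items(cfg))
```

Iterating over `dataclasses.fields(cfg)` is only one way around the error; whether it matches the eventual fix in the script is not something this sketch can confirm.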

Steps/Code to reproduce bug

Run this `main` function: https://github.com/NVIDIA/NeMo/blob/7cc99e95fa753f46dffffc47a19e3c1fa375159c/examples/asr/asr_chunked_inference/aed/speech_to_text_aed_chunked_infer.py#L118

The error is thrown at this line: https://github.com/NVIDIA/NeMo/blob/7cc99e95fa753f46dffffc47a19e3c1fa375159c/examples/asr/asr_chunked_inference/aed/speech_to_text_aed_chunked_infer.py#L122

Misc: this might be legacy code?

The same error might also be thrown here: https://github.com/NVIDIA/NeMo/blob/7cc99e95fa753f46dffffc47a19e3c1fa375159c/examples/asr/asr_chunked_inference/rnnt/speech_to_text_buffered_infer_rnnt.py#L149

nithinraok commented 5 days ago

@stevehuang52 please have a look at this.

stevehuang52 commented 5 days ago

Thanks for pointing out the issue. It will be fixed in https://github.com/NVIDIA/NeMo/pull/10581