modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

asr inference #1149

Open wwfcnu opened 9 months ago

wwfcnu commented 9 months ago

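Hello, I have a question. When I set batch_size=8 during inference, does that mean the number of tokens decoded in parallel while processing a single utterance, or does it mean that 8 utterances are run through inference at the same time?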

wwfcnu commented 9 months ago

My understanding is that it speeds up decoding of a single utterance, rather than processing several utterances at once. @LauraGPT

LauraGPT commented 9 months ago

https://alibaba-damo-academy.github.io/FunASR/en/egs_modelscope/asr/TEMPLATE/README_zh.html#id4

wwfcnu commented 9 months ago

I read the documentation. Is it the case that a batch_size larger than 1 is only set for GPU decoding, and CPU decoding uses the default batch_size=1?

wwfcnu commented 9 months ago

https://alibaba-damo-academy.github.io/FunASR/en/egs_modelscope/asr/TEMPLATE/README_zh.html#id4

For decoding a single long audio file, setting batch_size does speed things up, right?
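
To make the two readings of batch_size in this thread concrete, here is a minimal sketch in plain Python. It is not FunASR code and does not show which reading FunASR actually implements; `make_batches`, the file names, and the fixed 5-second segments are all hypothetical and used only for illustration. It assumes that a long recording would first be cut into segments (for example by a VAD step) before any batching could apply to it.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Group items into consecutive batches of at most batch_size (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


# Reading 1: several separate utterances are grouped and decoded together.
utterances = [f"utt_{i}.wav" for i in range(10)]  # hypothetical file names
utterance_batches = make_batches(utterances, batch_size=8)

# Reading 2: one long recording is cut into segments first (assumed here to be
# fixed 5-second windows), and the segments are what get batched.
segments = [(start, start + 5.0) for start in range(0, 60, 5)]  # 12 hypothetical segments
segment_batches = make_batches(segments, batch_size=8)

print([len(b) for b in utterance_batches])
print([len(b) for b in segment_batches])
```

In both readings the unit being batched is a whole audio item (an utterance or a segment), not individual tokens within one utterance. Whether a single long file benefits from batch_size therefore depends on whether it is segmented before decoding.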