Video recognition works, but audio recognition fails with an error. Does it require a 16 kHz sample rate?

modelscope / FunClip

Issue #18

Closed: bbeyondllove closed this issue 8 months ago

bbeyondllove commented 8 months ago

    prediction = await anyio.to_thread.run_sync(
  File "C:\Python310\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "C:\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "C:\Python310\lib\site-packages\gradio\utils.py", line 689, in wrapper
    response = f(*args, **kwargs)
  File "D:\gopath\src\ParaClipper\paraclipper\launch.py", line 22, in audio_recog
    return audio_clipper.recog(audio_input, sd_switch, hotwords=hotwords)
  File "D:\gopath\src\ParaClipper\paraclipper\videoclipper.py", line 28, in recog
    assert sr == 16000, "16kHz sample rate required, {} given.".format(sr)
AssertionError: 16kHz sample rate required, 32000 given.

R1ckShi commented 8 months ago

Please pull the latest code and try again. Audio input is now resampled automatically with librosa.

bbeyondllove commented 8 months ago

It works now, thanks.

yingw commented 7 months ago

Resampling does not help in my case either. It raises AttributeError: 'NoneType' object has no attribute 'format', at:

  File "/home/yinguowei/AI/FunClip/funclip/launch.py", line 21, in audio_recog
    return audio_clipper.recog(audio_input, sd_switch, hotwords=hotwords)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yinguowei/AI/FunClip/funclip/videoclipper.py", line 32, in recog
    logging.warning("Input wav shape: {}, only first channel reserved.").format(data.shape)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'format'

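It looks like the type of data is the problem.

Later I tried converting the audio to WAV with a 16000 Hz sample rate and a single channel, so that no resampling is needed, and only then did it work.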
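The traceback in this comment points at the placement of `.format` rather than at the type of `data`: `logging.warning(...)` returns `None`, and the quoted line calls `.format(data.shape)` on that return value instead of on the message string. Judging only from the message text, that line presumably runs for multi-channel input, which would explain why a mono file avoids it. The sketch below reproduces the pattern in isolation; the function names and the example shape are made up for illustration.

```python
import logging


def warn_buggy(shape):
    # Same pattern as in the traceback: .format() is applied to the return
    # value of logging.warning(), which is None.
    logging.warning("Input wav shape: {}, only first channel reserved.").format(shape)


def warn_fixed(shape):
    # Format the string first, then pass the result to logging.warning().
    logging.warning("Input wav shape: {}, only first channel reserved.".format(shape))


def warn_lazy(shape):
    # Alternative: let logging do the %-style interpolation itself.
    logging.warning("Input wav shape: %s, only first channel reserved.", shape)


shape = (64000, 2)  # hypothetical stereo input shape
try:
    warn_buggy(shape)
except AttributeError as exc:
    print("buggy variant raised:", exc)

warn_fixed(shape)
warn_lazy(shape)
```

Moving the closing parenthesis, as in `warn_fixed`, is the smallest change. The `%s` form in `warn_lazy` is the style the logging module itself supports for deferred formatting.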

escalate007 commented 6 months ago

I did the same.
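When relying on this workaround, it can help to confirm that a converted file really is 16 kHz mono before uploading it. A minimal sketch using only the standard library follows; the helper name `is_16k_mono_wav` is hypothetical, and the `wave` module only reads PCM-encoded WAV files, so other encodings raise `wave.Error`.

```python
import wave


def is_16k_mono_wav(path: str) -> bool:
    """True if the WAV file at `path` is single-channel with a 16 kHz sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getnchannels() == 1
```

A file that fails this check would still need conversion (or the resampling path in the latest code) before recognition.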