zhaomaoniu / nonebot-plugin-gpt-sovits

✨ NoneBot2 Plugin for GPT-SoVITS ✨
MIT License

BUG: Audio files cannot be read correctly #2

Closed Well2333 closed 1 month ago

Well2333 commented 1 month ago

Code that triggers the error

Nonebot Plugin GPT Sovits utils.py L46

def get_wav_duration(wav_bytes: bytes) -> float:
    with SoundFile(io.BytesIO(wav_bytes)) as f:
        return len(f) / f.samplerate

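The code passes io.BytesIO(wav_bytes) to SoundFile in place of a local file but does not specify the audio format, so SoundFile cannot recognise the format:

soundfile.LibsndfileError: Error opening <io.BytesIO object at 0x00000242BD47E020>: Format not recognised.

The cause is that no format is given when the BytesIO object is passed in. The python-soundfile documentation also covers this: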

        If a file is opened with `mode` ``'r'`` (the default) or
        ``'r+'``, no sample rate, channels or file format need to be
        given because the information is obtained from the file. An
        exception is the ``'RAW'`` data format, which always requires
        these data points.

To fix this, you can specify the audio format explicitly:

def get_wav_duration(wav_bytes: bytes) -> float:
    with SoundFile(io.BytesIO(wav_bytes), format="WAV") as f:
        return len(f) / f.samplerate

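Alternatively, consider writing the bytes to a file first and then opening that file with SoundFile.

Finally, I suggest running a basic functional test after changing the code to make sure the problem is actually resolved.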
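If the payload is always a plain PCM WAV file, the duration can also be computed with Python's standard-library `wave` module, which never has to guess the container format. The following is only a sketch under that assumption, not the plugin's code: the names `get_wav_duration_stdlib` and `make_silence` are hypothetical, and `wave` rejects compressed or floating-point WAV variants, so it is not a drop-in replacement for every input.

```python
import io
import wave


def get_wav_duration_stdlib(wav_bytes: bytes) -> float:
    """Return the duration in seconds of an in-memory PCM WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as f:
        # Number of frames divided by frames per second gives seconds.
        return f.getnframes() / f.getframerate()


def make_silence(seconds: float, rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory (demo helper only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(rate)
        f.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    print(get_wav_duration_stdlib(make_silence(1.5)))
```

A generated clip is a convenient input for the basic functional test suggested above, because its expected duration is known in advance.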

zhaomaoniu commented 1 month ago

In my tests, the code you flagged does not raise soundfile.LibsndfileError on Windows Server 2016. After applying your suggested change, however, a different error appeared:

TypeError: Not allowed for existing files (except 'RAW'): samplerate, channels, format, subtype, endian

So in ff57a279bd741582a209d1f20d7e59bfed196c2c I replaced soundfile with pydub. The code now reads:

def get_wav_duration(wav_bytes: bytes) -> float:
    audio = AudioSegment.from_file(io.BytesIO(wav_bytes), format="wav")
    return len(audio) / 1000

This worked without problems in my tests. Could you update to v0.1.3 and check whether the problem still occurs?

linya72 commented 1 month ago

Hit an FFmpeg exception after updating the version. e802de2f66232fcc6fe2c37bed6b30f0

zhaomaoniu commented 1 month ago

"Hit an FFmpeg exception after updating the version."

Could you tell me which GPT-SoVITS version you are using?

Tokiichika commented 1 month ago

Hello, I also ran into an ffmpeg error with the latest version, 0.1.3. The full log is below:

09-13 00:39:52 [SUCCESS] nonebot | OneBot V11 1754277314 | [message.group.normal]: Message 405282424 from 1279478673@[群:962464091] '智乃说 私はチノです、喫茶店ラビットハウスのマスターの孫です。 -e 0 '
09-13 00:39:53 [INFO] nonebot | Event will be handled by AlconnaMatcher(type='', module=nonebot_plugin_gpt_sovits, lineno=50)
09-13 00:39:53 [INFO] nonebot | AlconnaMatcher(type='', module=nonebot_plugin_gpt_sovits, lineno=50) running complete
09-13 00:39:53 [ERROR] nonebot | Running AlconnaMatcher(type='', module=nonebot_plugin_gpt_sovits, lineno=50) failed.
Traceback (most recent call last):
  File "<string>", line 17, in <module>
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\__init__.py", line 335, in run
    get_driver().run(*args, **kwargs)
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\drivers\fastapi.py", line 186, in run
    uvicorn.run(
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\uvicorn\main.py", line 577, in run
    server.run()
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\uvicorn\server.py", line 65, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 194, in run
    return runner.run(main)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 674, in run_until_complete
    self.run_forever()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\windows_events.py", line 322, in run_forever
    super().run_forever()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 641, in run_forever
    self._run_once()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 1987, in _run_once
    handle._run()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\asyncio\events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\message.py", line 476, in check_and_run_matcher
    await _run_matcher(
> File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\message.py", line 428, in _run_matcher
    await matcher.run(bot, event, state, stack, dependency_cache)
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\internal\matcher\matcher.py", line 850, in run
    await self.simple_run(bot, event, state, stack, dependency_cache)
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\internal\matcher\matcher.py", line 825, in simple_run
    await handler(
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot\dependencies\__init__.py", line 94, in __call__
    return await cast(Callable[..., Awaitable[R]], self.call)(**values)
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot_plugin_gpt_sovits\__init__.py", line 131, in handle_tts
    duration = int(get_wav_duration(wav_file))
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\nonebot_plugin_gpt_sovits\utils.py", line 46, in get_wav_duration
    audio = AudioSegment.from_file(io.BytesIO(wav_bytes), format="wav")
  File "C:\Users\Administrator\Desktop\Chino-Bot\ChinoBot\.venv\Lib\site-packages\pydub\audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 3199971767

Output from ffmpeg/avlib:

ffmpeg version 7.0.1-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 13.2.0 (Rev5, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      59.  8.100 / 59.  8.100
  libavcodec     61.  3.100 / 61.  3.100
  libavformat    61.  1.100 / 61.  1.100
  libavdevice    61.  1.100 / 61.  1.100
  libavfilter    10.  1.100 / 10.  1.100
  libswscale      8.  1.100 /  8.  1.100
  libswresample   5.  1.100 /  5.  1.100
  libpostproc    58.  1.100 / 58.  1.100
[wav @ 000001c4c66d0c80] invalid start code [123][34]de in RIFF header
[cache @ 000001c4c66d1240] Statistics, cache hits:0 cache misses:1
[in#0 @ 000001c4c66d0880] Error opening input: Invalid data found when processing input
Error opening input file cache:pipe:0.
Error opening input files: Invalid data found when processing input

I used the 240821 all-in-one package from here as-is and ran its api_v2.py. When the NoneBot side raised the error, the GPT-SoVITS backend also logged a 404 error: [image]
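A note on the ffmpeg message: the bytes `[123][34]de` in "invalid start code" decode to `{"de`. That is consistent with a JSON error body, such as a 404 response, being handed to the decoder instead of WAV data, although the log alone does not confirm the exact body. A minimal sketch of a guard that checks the RIFF/WAVE magic bytes before decoding follows; `ensure_wav` is a hypothetical helper, not part of the plugin or of pydub.

```python
import json


def ensure_wav(payload: bytes) -> bytes:
    """Return payload unchanged if it starts with a RIFF/WAVE header, else raise."""
    # A WAV file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    if len(payload) >= 12 and payload[:4] == b"RIFF" and payload[8:12] == b"WAVE":
        return payload
    # Otherwise try to surface a JSON error body in the exception message.
    try:
        detail = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        detail = payload[:32]  # fall back to the first raw bytes
    raise ValueError(f"expected WAV data, got: {detail!r}")


if __name__ == "__main__":
    # The two byte values printed by ffmpeg, followed by "de".
    print(bytes([123, 34]) + b"de")
    try:
        ensure_wav(b'{"detail":"Not Found"}')  # assumed example of an error body
    except ValueError as exc:
        print(exc)
```

Calling a check like this on the HTTP response before measuring the duration would turn the opaque ffmpeg failure into an error that names the real problem.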

zhaomaoniu commented 1 month ago

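api_v2.py did not exist yet when the plugin was written, which is what causes this error.

I'm sorry the plugin isn't working for you. I will add support for api_v2.py as soon as possible.

Until then, you can try connecting the plugin to api.py, which can also load v2 models.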

Tokiichika commented 1 month ago

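Thanks, the plugin works fine after switching to v1. Perhaps the v1 API doesn't support some v2 features, though? It can load v2 models, but the inference quality is not as good as with the webui. In any case, thank you for your hard work. I look forward to v2 API support!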

zhaomaoniu commented 1 month ago

@linya72 @Tokiichika Please update the plugin to v0.2.0 and reconfigure it according to the README. The plugin should now work correctly.