Generate prompt through Reference Audio "Unsupported backend 'ffmpeg' specified; ", "please select one of ['sox', 'soundfile'] instead."

Jonathan-Wei commented 2 days ago

Self Checks

[X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

Hello, guys! I uploaded a Reference Audio and Reference Text through the web UI and ran generation, but the web UI displayed an error message, and Inference.ipynb shows the following exception:

Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/data/fish-speech/tools/webui.py", line 198, in inference_wrapper
    _, audio_data, error_message = next(result)
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
  File "/data/fish-speech/tools/webui.py", line 85, in inference
    prompt_tokens = encode_reference(
  File "/data/fish-speech/tools/api.py", line 112, in encode_reference
    reference_audio_content = load_audio(
  File "/data/fish-speech/tools/api.py", line 94, in load_audio
    waveform, original_sr = torchaudio.load(
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 204, in load
    backend = dispatcher(uri, format, backend)
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 111, in dispatcher
    return get_backend(backend_name, backends)
  File "/usr/local/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 36, in get_backend
    raise ValueError(
ValueError: ("Unsupported backend 'ffmpeg' specified; ", "please select one of ['sox', 'soundfile'] instead.")
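The error message says the 'ffmpeg' backend was requested, while this torchaudio build only reports 'sox' and 'soundfile' as usable. A minimal diagnostic sketch, assuming torchaudio 2.x where `torchaudio.list_audio_backends()` is available; `pick_backend` and `available_backends` are hypothetical helpers for illustration, not part of fish-speech or torchaudio:

```python
from typing import List, Optional, Sequence


def pick_backend(
    available: Sequence[str],
    preferred: Sequence[str] = ("ffmpeg", "soundfile", "sox"),
) -> Optional[str]:
    """Return the first preferred backend that is actually available, else None."""
    for name in preferred:
        if name in available:
            return name
    return None


def available_backends() -> List[str]:
    """Ask torchaudio which backends it can use.

    Returns an empty list if torchaudio is not importable or if this
    torchaudio version does not expose list_audio_backends().
    """
    try:
        import torchaudio
    except ImportError:
        return []
    lister = getattr(torchaudio, "list_audio_backends", None)
    return list(lister()) if lister is not None else []


if __name__ == "__main__":
    backends = available_backends()
    print("torchaudio backends:", backends)
    print("fallback choice:", pick_backend(backends))
```

If 'ffmpeg' is missing from the printed list, the chosen fallback name could be passed as the `backend` argument of `torchaudio.load`, the call that fails in the traceback above.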

[Screenshot: web UI]

✔️ Expected Behavior

Generation runs normally and returns audio data.

❌ Actual Behavior

The web UI shows an exception, and no audio data is returned.

leng-yue commented 2 days ago

Are you using the official container? It should have ffmpeg installed.

Stardust-minus commented 2 days ago

run apt install ffmpeg

Jonathan-Wei commented 48 minutes ago

> run apt install ffmpeg

I installed from source using conda. Both ffmpeg and libsox-dev are already installed.

(fish-speech) root@feixin:/data/fish-speech# apt install libsox-dev ffmpeg
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libsox-dev is already the newest version (14.4.2+git20190427-4build4).
ffmpeg is already the newest version (7:6.1.1-3ubuntu5).
The following packages were automatically installed and are no longer required:
  gir1.2-nm-1.0 ipset libipset13 python3-cap-ng python3-firewall python3-nftables
Use 'apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 128 not upgraded.
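Having the apt package installed does not by itself show that the Python process can find the FFmpeg shared libraries. A rough check using only the standard library is sketched below; the list of library names is an assumption about what an FFmpeg-based backend needs, and `ffmpeg_report` is a hypothetical helper. Inside a conda environment the loader paths may differ from what this reports, so treat the result as a hint only:

```python
import ctypes.util
import shutil

# Assumed set of FFmpeg libraries to look for (not taken from the thread).
FFMPEG_LIBS = ("avcodec", "avformat", "avutil")


def ffmpeg_report(libs=FFMPEG_LIBS) -> dict:
    """Map the ffmpeg binary and each libav* library to its location (or None)."""
    report = {"ffmpeg_binary": shutil.which("ffmpeg")}
    for lib in libs:
        # find_library returns a library name/path, or None if not found
        report["lib" + lib] = ctypes.util.find_library(lib)
    return report


if __name__ == "__main__":
    for key, value in ffmpeg_report().items():
        print(f"{key}: {value or 'NOT FOUND'}")
```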

leng-yue commented 30 minutes ago

Is the issue solved then?
