jasonppy / VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Easy to use with gradio client #156

Open gosharevo opened 3 weeks ago

gosharevo commented 3 weeks ago

Hello! I want to use VoiceCraft on my local machine through the Gradio client. I have tried many options, but I can't figure out how to call it via the API as easily as the web version works. Could someone give me an example, please? Here is my attempt:

from gradio_client import Client

client = Client("")

result = client.predict(
    "whisper",   # str in 'Whisper backend' Radio component
    "base.en",   # str in 'Whisper model' Radio component
    "whisperX",  # str in 'Forced alignment model' Radio component
    "330M",      # str in 'VoiceCraft model' Radio component
    # api_name=...  (endpoint name not shown in the post)
)
result = client.predict(
    5,         # int | float in 'seed' Number component
    5,         # int | float in 'left_margin' Number component
    5,         # int | float in 'right_margin' Number component
    5,         # int | float in 'codec_audio_sr' Number component
    5,         # int | float in 'codec_sr' Number component
    5,         # int | float in 'top_k' Number component
    5,         # int | float in 'top_p' Number component
    5,         # int | float in 'temperature' Number component
    "-1",      # str in 'stop_repetition' Radio component
    5,         # int | float in 'speech rate' Number component
    "0",       # str in 'kvcache' Radio component
    "Howdy!",  # str in 'silence tokens' Textbox component
    "https://github.com/gradio-app/gradio/raw/main/test/test_files/audio_sample.wav",  # str (filepath on your computer (or URL) of file) in 'Input Audio' Audio component
    "Howdy!",  # str in 'Text' Textbox component
    True,      # bool in 'Smart transcript' Checkbox component
    0,         # int | float (numeric value between 0 and 7.614) in 'Prompt end time' Slider component
    0,         # int | float (numeric value between 0 and 7.614) in 'Edit from time' Slider component
    0,         # int | float (numeric value between 0 and 7.614) in 'Edit to time' Slider component
    "Newline", # str in 'Split text' Radio component
    "Hey",     # str (Option from: []) in 'Sentence' Dropdown component
    # api_name=...  (endpoint name not shown in the post)
)

Loaded as API: ✔️
Traceback (most recent call last):
  File "/home/georgy/Project/VoiceCraft/temp.py", line 12, in <module>
    result = client.predict(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio_client/client.py", line 292, in predict
    return self.submit(*args, api_name=api_name, fn_index=fn_index).result()
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio_client/client.py", line 1131, in result
    return super().result(timeout=timeout)
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio_client/client.py", line 798, in _inner
    predictions = _predict(*data)
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio_client/client.py", line 829, in _predict
    raise ValueError(result["error"])
ValueError: None
Traceback (most recent call last):
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio/blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/home/georgy/anaconda3/envs/voicecraft/lib/python3.9/site-packages/gradio/utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "/home/georgy/Project/VoiceCraft/gradio_app.py", line 239, in run
    selected_sentence_idx = int(selected_sentence[:colon_position])
ValueError: invalid literal for int() with base 10: 'He'
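
The last frame of the server traceback suggests where the call goes wrong: `run` in `gradio_app.py` takes everything before a colon in the 'Sentence' value and converts it to an integer. The value "Hey" has no colon, so the slice yields 'He' and `int()` fails. Below is a minimal sketch that reproduces this. It assumes `colon_position` comes from `str.find(":")` (the traceback does not show this), and `format_sentence_choice` is a hypothetical helper, not part of VoiceCraft. The "<index>: <text>" shape of the value is inferred from the parsing line only.

```python
def parse_sentence_index(selected_sentence: str) -> int:
    """Mimic the parsing step shown in the traceback (gradio_app.py, in run)."""
    # Assumption: colon_position is computed with str.find, which returns -1
    # when the string contains no ':'.
    colon_position = selected_sentence.find(":")
    return int(selected_sentence[:colon_position])


def format_sentence_choice(index: int, text: str) -> str:
    """Hypothetical helper: build a 'Sentence' value shaped like '<index>: <text>'."""
    return f"{index}: {text}"


if __name__ == "__main__":
    # The value from the post has no colon, so the slice drops the last
    # character and int() receives 'He'.
    try:
        parse_sentence_index("Hey")
    except ValueError as err:
        print("fails:", err)

    # A value with a leading index and a colon parses cleanly.
    print(parse_sentence_index(format_sentence_choice(0, "Howdy!")))
```

If this reading is right, passing a 'Sentence' value that starts with an integer index followed by a colon (for example "0: Howdy!") should get past this particular line; whether the remaining arguments are accepted is a separate question.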