BuffMcBigHuge / text-generation-webui-edge-tts

A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui.
41 stars · 3 forks

Traceback when using the rvc_models #3

Closed · thecord closed this 1 year ago

thecord commented 1 year ago

First, thank you so much for the great extension; it works like a charm with plain edge_tts. The issue appears in the interface when I try to use an RVC model. Please see the log below:

llama_print_timings:        load time =  9259.74 ms
llama_print_timings:      sample time =     4.87 ms /    22 runs   (    0.22 ms per token,  4514.67 tokens per second)
llama_print_timings: prompt eval time =  9259.64 ms /    80 tokens (  115.75 ms per token,     8.64 tokens per second)
llama_print_timings:        eval time =  5778.39 ms /    21 runs   (  275.16 ms per token,     3.63 tokens per second)
llama_print_timings:       total time = 15094.01 ms
Output generated in 15.55 seconds (1.35 tokens/s, 21 tokens, context 80, seed 1079273127)
Outputting audio to extensions\edge_tts\outputs\1696602833.mp3
Running RVC
gin_channels: 256 self.spk_embed_dim: 109
Model loaded
Audio duration: 5.64s
Traceback (most recent call last):
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions\edge_tts\script.py", line 330, in tts
    audio_opt = vc.pipeline(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions/edge_tts\vc_infer_pipeline.py", line 398, in pipeline
    self.vc(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions/edge_tts\vc_infer_pipeline.py", line 251, in vc
    (net_g.infer(feats, p_len, pitch, pitchf, sid)[0][0, 0])
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions/edge_tts\lib\infer_pack\models.py", line 752, in infer
    m_p, logs_p, x_mask = self.enc_p(phone, pitch, phone_lengths)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions/edge_tts\lib\infer_pack\models.py", line 97, in forward
    x = self.emb_phone(phone) + self.emb_pitch(pitch)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Half but found Float

Traceback (most recent call last):
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\routes.py", line 427, in run_predict
    output = await app.get_blocks().process_api(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\blocks.py", line 1067, in call_function
    prediction = await utils.async_iteration(iterator)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 336, in async_iteration
    return await iterator.__anext__()
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 329, in __anext__
    return await anyio.to_thread.run_sync(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 2106, in run_sync_in_worker_thread
    return await future
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 833, in run
    result = context.run(func, *args)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\installer_files\env\lib\site-packages\gradio\utils.py", line 312, in run_sync_iterator_async
    return next(iterator)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\modules\chat.py", line 329, in generate_chat_reply_wrapper
    for i, history in enumerate(generate_chat_reply(text, state, regenerate, _continue, loading_message=True)):
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\modules\chat.py", line 297, in generate_chat_reply
    for history in chatbot_wrapper(text, state, regenerate=regenerate, _continue=_continue, loading_message=loading_message):
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\modules\chat.py", line 265, in chatbot_wrapper
    output['visible'][-1][1] = apply_extensions('output', output['visible'][-1][1], state, is_chat=True)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\modules\extensions.py", line 224, in apply_extensions
    return EXTENSION_MAP[typ](*args, **kwargs)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\modules\extensions.py", line 82, in _apply_string_extensions
    text = func(*args, **kwargs)
  File "H:\TextGeneration\oobabooga_windows\text-generation-webui\extensions\edge_tts\script.py", line 160, in output_modifier
    wavfile.write(output_file, 44100, audio.astype(np.int16))
AttributeError: 'tuple' object has no attribute 'astype'

Please help me fix the issue, and thanks again for the lovely extension 👍

BuffMcBigHuge commented 1 year ago

This happens when no RVC .pth file is selected. To fix this, either uncheck the RVC option, or add model weights to the rvc_models folder and select them in the dropdown.

I will make improvements to handle this edge case.
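For anyone hitting the same `AttributeError: 'tuple' object has no attribute 'astype'` before a fixed release lands, one way to harden the output path is to coerce whatever the RVC stage returns into an int16 array before calling `wavfile.write`. This is a minimal sketch, not the extension's actual code: the helper name `normalize_rvc_output` and the assumption that a failing path may return a `(sample_rate, samples)` tuple are illustrative, inferred from the traceback above.

```python
import numpy as np

def normalize_rvc_output(audio):
    """Coerce an RVC/TTS pipeline result into an int16 ndarray.

    Hypothetical guard: some failure paths appear to hand back a
    (sample_rate, samples) tuple instead of a bare ndarray, which then
    blows up on .astype() in output_modifier.
    """
    if isinstance(audio, tuple):
        # unwrap a (rate, samples)-style return; keep only the samples
        audio = audio[-1]
    audio = np.asarray(audio)
    if np.issubdtype(audio.dtype, np.floating):
        # float audio in [-1, 1] must be rescaled to 16-bit PCM range
        audio = np.clip(audio, -1.0, 1.0) * 32767.0
    return audio.astype(np.int16)
```

With this guard in place, `wavfile.write(output_file, 44100, normalize_rvc_output(audio))` would at least produce a valid file instead of crashing, though the real fix remains selecting (or disabling) the RVC model as described above.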