C0untFloyd / bark-gui

🔊 Text-Prompted Generative Audio Model with Gradio
MIT License
674 stars 63 forks

Cloned voice/custom prompt with "use coarse history" failing to generate voice #11

Closed seruko11 closed 1 year ago

seruko11 commented 1 year ago

I used Bark-GUI to clone a prompt from an audio sample, and that worked great. When I try to create speech from text using the custom voice, I get the error below. I can still create arbitrary audio from text using the pre-built prompts. The error only happens when I have "use coarse history" checked. Possibly this is a bark-gui problem.

Generating Text (1/1) -> custom\MeMyselfAndI:Hello Sir, How can I help you today?
Traceback (most recent call last):
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\gradio\routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\gradio\blocks.py", line 1299, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\gradio\blocks.py", line 1022, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\venv\Lib\site-packages\gradio\helpers.py", line 588, in tracked_fn
    response = fn(*args)
               ^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\webui.py", line 114, in generate_text_to_speech
    audio_array = generate_audio(text, selected_speaker, text_temp, waveform_temp)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\bark\api.py", line 113, in generate_audio
    out = semantic_to_waveform(
          ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\bark\api.py", line 54, in semantic_to_waveform
    coarse_tokens = generate_coarse(
                    ^^^^^^^^^^^^^^^^
  File "C:\Users\toor\Desktop\Ai\seait_installers_version_0.1.4\bark-gui\bark\generation.py", line 592, in generate_coarse
    round(x_coarse_history.shape[-1] / len(x_semantic_history), 1)
AssertionError
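For context, the failing frame compares the length of the coarse history with the length of the semantic history stored in the voice prompt. Below is a minimal sketch of that kind of check. It assumes semantic tokens at 49.9 Hz and coarse frames at 75 Hz; those rates are recalled from Bark's generation code and are not stated in this thread. `history_ratio_ok` is a hypothetical helper and not part of bark or bark-gui.

```python
# Assumed token rates (not confirmed in this thread).
SEMANTIC_RATE_HZ = 49.9
COARSE_RATE_HZ = 75.0


def history_ratio_ok(n_semantic_tokens, n_coarse_frames):
    """Check whether the two history lengths are mutually consistent.

    n_semantic_tokens corresponds to len(x_semantic_history) and
    n_coarse_frames to x_coarse_history.shape[-1] in the traceback.
    Returns (ok, observed_ratio, expected_ratio).
    """
    expected = round(COARSE_RATE_HZ / SEMANTIC_RATE_HZ, 1)
    if n_semantic_tokens <= 0:
        # An empty semantic history cannot satisfy the ratio check.
        return False, None, expected
    observed = round(n_coarse_frames / n_semantic_tokens, 1)
    return observed == expected, observed, expected


if __name__ == "__main__":
    # 200 semantic tokens and 300 coarse frames cover the same time span.
    print(history_ratio_ok(200, 300))
    # A mismatched prompt would trip an assertion like the one above.
    print(history_ratio_ok(200, 500))
```

If the observed ratio differs from the expected one, the semantic and coarse parts of the custom prompt probably describe different amounts of audio, which would only matter when the coarse history is actually used.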

C0untFloyd commented 1 year ago

I haven't been able to reproduce this error so far, though I was using the suggested input audio length of only ~4 seconds. Did you and @Deadstarr perhaps use a very long audio clip as input? How big is your .npz file? Could one of you attach it here? On the other hand, voice cloning is currently not worth the hassle; just have a look here.
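To answer the size question without uploading the file, something like the following could be run against the cloned prompt. It is only a sketch: it assumes the .npz holds arrays named `semantic_prompt`, `coarse_prompt` and `fine_prompt` (the usual Bark prompt layout, not confirmed in this thread) and the same assumed rates of 49.9 Hz and 75 Hz. `describe_prompt` is a hypothetical helper.

```python
import os

import numpy as np

# Assumed token rates (not confirmed in this thread).
SEMANTIC_RATE_HZ = 49.9
COARSE_RATE_HZ = 75.0


def describe_prompt(path):
    """Summarize a voice prompt file: size on disk, array shapes, rough durations."""
    info = {"file_bytes": os.path.getsize(path), "shapes": {}}
    with np.load(path) as data:
        for key in data.files:
            info["shapes"][key] = data[key].shape
        # Key names below are assumptions about the prompt layout.
        if "semantic_prompt" in data.files:
            n_semantic = len(data["semantic_prompt"])
            info["semantic_seconds"] = n_semantic / SEMANTIC_RATE_HZ
        if "coarse_prompt" in data.files:
            n_coarse = data["coarse_prompt"].shape[-1]
            info["coarse_seconds"] = n_coarse / COARSE_RATE_HZ
    return info
```

Calling `describe_prompt` with the path to the custom .npz returns a dictionary. If `semantic_seconds` and `coarse_seconds` differ noticeably, the prompt is a likely candidate for the assertion failure reported above.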

C0untFloyd commented 1 year ago

Closing this as I can't reproduce.