JarodMica / StyleTTS-WebUI

MIT License

A Different transcribe and process error #23

Closed edbartz closed 2 months ago

edbartz commented 2 months ago

I have tried installing three times now, the last time following along step by step (I am running a 3070 Ti 8 GB card with a 30-minute WAV file). I always get the same error, and I can't see which file it is looking for and failing to find.

(venv) PS C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI> python webui.py
C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\pyannote\audio\core\io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
preprocessor_config.json: 100%|██████████| 340/340 [00:00<?, ?B/s]
config.json: 100%|██████████| 2.39k/2.39k [00:00<?, ?B/s]
vocabulary.json: 100%|██████████| 1.07M/1.07M [00:00<00:00, 8.96MB/s]
tokenizer.json: 100%|██████████| 2.48M/2.48M [00:00<00:00, 11.8MB/s]
model.bin: 100%|██████████| 3.09G/3.09G [01:29<00:00, 34.4MB/s]
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.4.0. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint C:\Users\xxx.xxx\.cache\torch\whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.3.1+cu121. Bad things might happen unless you revert torch to 1.x.
Loaded Whisper model
Traceback (most recent call last):
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\venv\Lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\webui.py", line 355, in transcribe_other_language_proxy
    file_durations = [get_duration(os.path.join(chosen_directory, item)) for item in items if os.path.isfile(os.path.join(chosen_directory, item))]
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\webui.py", line 355, in <listcomp>
    file_durations = [get_duration(os.path.join(chosen_directory, item)) for item in items if os.path.isfile(os.path.join(chosen_directory, item))]
  File "C:\Users\xxx.xxx\Desktop\TTS\StyleTTS-WebUI\modules\tortoise_dataset_tools\audio_conversion_tools\split_long_file.py", line 12, in get_duration
    duration_output = subprocess.check_output(['ffprobe', '-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', file_path])
  File "C:\Users\xxx.xxx\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Users\xxx.xxx\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\xxx.xxx\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\xxx.xxx\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

JarodMica commented 2 months ago

Missing ffmpeg (the traceback fails when it tries to run ffprobe, which ships with ffmpeg). Not your fault: it is missing from the instructions, and I haven't updated them yet.

Check duplicate: https://github.com/JarodMica/StyleTTS-WebUI/issues/11
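For anyone hitting the same traceback, here is a minimal sketch of how the missing executable could be detected before the transcription step runs. It assumes ffmpeg and ffprobe are expected on PATH. The names `find_missing_tools` and `get_duration_checked` are hypothetical and are not part of StyleTTS-WebUI; only the ffprobe arguments are copied from the `get_duration` call shown in the traceback.

```python
import shutil
import subprocess


def find_missing_tools(tools=("ffmpeg", "ffprobe"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


def get_duration_checked(file_path, which=shutil.which):
    """Return the duration of an audio file in seconds via ffprobe.

    Hypothetical wrapper (not the project's own get_duration): it raises
    a readable error instead of the bare
    "FileNotFoundError: [WinError 2]" when ffprobe is not installed.
    """
    if which("ffprobe") is None:
        raise RuntimeError(
            "ffprobe was not found on PATH. Install ffmpeg (which "
            "includes ffprobe) and restart the terminal."
        )
    # Same ffprobe arguments as the call shown in the traceback.
    output = subprocess.check_output([
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        file_path,
    ])
    return float(output.decode().strip())


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe were found on PATH.")
```

Running the script inside the activated venv, before launching `webui.py`, reports whether either tool is missing. The point of the wrapper is that the error names the missing program, which the original `FileNotFoundError` does not.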

edbartz commented 2 months ago

Thanks a lot for the help. I do have it running now, though I also needed the fixes you mentioned for espeak. On the other hand, 8 GB is painful. I had to break down and order a 3090 just for this.

JarodMica commented 2 months ago

Glad to hear it! I'll have to update the instructions to add that espeak fix as well. That is the unfortunate reality of running these tools, but you'll be glad you did once you swap.