Closed: Milincho closed this issue 1 year ago.
Hi @Milincho, that's weird. Are you using webui.bat to open the Gradio server?
Also, if you are updating from version 2.x, make sure to delete the venv directory in your whisper-auto-transcribe directory first.
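(A minimal sketch of that from a cmd prompt, assuming the folder is the cloned whisper-auto-transcribe directory:)
    cd whisper-auto-transcribe
    rem delete the old 2.x virtual environment so it can be rebuilt
    rmdir /s /q venv
    rem recreates the venv and reinstalls all dependencies
    webui.bat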
Installed whisper-timestamped and now it says this:
Sorry, you don't need to "install" anything; just execute webui.bat.
It will set up a virtual environment and install all the dependencies you need.
Ok, did that:
But even the web UI doesn't work. It doesn't import MKV files, and when I try to add an MP4 file it says "Error" on 'Initiate job'.
When I uninstall whisper_timestamped, it goes back to the same error: ModuleNotFoundError: No module named 'whisper_timestamped'
Even with WebUI loaded:
@Milincho, you need to enable the Python virtual environment in command mode. Additionally, could you provide more information, such as the error code shown by webui.bat, the filename, the subtitle name, and the other settings related to the GUI error?
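(A minimal sketch of what "enable the virtual environment in command mode" looks like, assuming webui.bat created the venv in the repo's venv folder:)
    cd whisper-auto-transcribe
    rem standard venv activation on Windows (assumes the venv lives in .\venv)
    venv\Scripts\activate.bat
    rem the prompt should now start with (venv); run cli.py from this same window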
I add the file:
The video loads and plays ok:
I click on "Submit". Default options & 'translate':
Click 'Initiate job' and:
And this is the info in the cmd window:
It's the 'Vocal Extractor' option that gives the error:
Without this option the GUI version works. It shows several warnings, but it runs with both CPU and CUDA:
BUT I am interested in making the CLI version work. I always use the CLI and not the GUI version.
@Milincho Thanks for your information. I will take a look at it.
Fixed in v3.1. This is how to update to the latest version:
enable_venv.bat
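(For reference, the full update sequence from a fresh cmd window presumably looks like this; git pull and the folder layout are assumptions, while webui.bat and enable_venv.bat are the scripts mentioned in this thread:)
    cd whisper-auto-transcribe
    rem pull the v3.1 changes (assumes the folder is a git clone)
    git pull
    rem one-time step: rebuild the venv and install dependencies
    webui.bat
    rem for CLI use afterwards: activate the venv in the current window
    enable_venv.bat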
Didn't work for me, sadly. I even deleted the repo and re-cloned it, but I'm getting:
(venv) C:\Users\san-a\Downloads\tools\whisper-auto-transcribe>python .\cli.py "F:\j\abc-123\abc-123.HD.mp4" --output "F:\j\abc-123\xxx.srt" -lang ja --task translate --model large --device cuda
Important: the default model was recently changed to `htdemucs` the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use `-n mdx_extra_q`.
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\tmp\tmpviibozmx\htdemucs
Separating track C:\Users\san-a\AppData\Local\Temp\tmpviibozmx
Could not load file C:\Users\san-a\AppData\Local\Temp\tmpviibozmx. Maybe it is not a supported file format?
When trying to load using ffmpeg, got the following error: FFmpeg is not installed.
When trying to load using torchaudio, got the following error: Error opening 'C:\\Users\\san-a\\AppData\\Local\\Temp\\tmpviibozmx': File contains data in an unknown format.
Traceback (most recent call last):
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\src\utils\task.py", line 83, in transcribe
subprocess.run(cmd, check=True)
File "C:\Users\san-a\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmpviibozmx" -o "./tmp/tmpviibozmx/" --filename "{stem}.{ext}"' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\cli.py", line 106, in <module>
cli()
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\cli.py", line 90, in cli
subtitle_path = transcribe(
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\src\utils\task.py", line 85, in transcribe
raise Exception(
Exception: Error. Vocal extracter unavailable. Received: demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmpviibozmx" -o "./tmp/tmpviibozmx/" --filename "{stem}.{ext}"
Error Code: Command 'demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmpviibozmx" -o "./tmp/tmpviibozmx/" --filename "{stem}.{ext}"' returned non-zero exit status 1.
That is despite me running webui.bat and enable_venv.bat.
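(Side note: the "FFmpeg is not installed" line in that log suggests demucs cannot find ffmpeg from inside the venv. A quick check from the same activated cmd window, using standard Windows commands:)
    rem confirm ffmpeg is reachable from the activated cmd window
    where ffmpeg
    ffmpeg -version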
OK, this is weird. I got the error above, then within the same cmd session I ran webui.bat and re-ran the command. This time CUDA is running out of memory, which didn't happen with the previous version.
Important: the default model was recently changed to `htdemucs` the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use `-n mdx_extra_q`.
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\tmp\tmp15_8s0zt\htdemucs
Separating track C:\Users\san-a\AppData\Local\Temp\tmp15_8s0zt
Traceback (most recent call last):
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\venv\Scripts\demucs-script.py", line 33, in <module>
sys.exit(load_entry_point('demucs==4.0.0', 'console_scripts', 'demucs')())
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\venv\lib\site-packages\demucs\separate.py", line 163, in main
sources = apply_model(model, wav[None], device=args.device, shifts=args.shifts,
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\venv\lib\site-packages\demucs\apply.py", line 171, in apply_model
out = apply_model(sub_model, mix, **kwargs)
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\venv\lib\site-packages\demucs\apply.py", line 196, in apply_model
shifted_out = apply_model(model, shifted, **kwargs)
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\venv\lib\site-packages\demucs\apply.py", line 211, in apply_model
weight = th.cat([th.arange(1, segment // 2 + 1, device=device),
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\src\utils\task.py", line 83, in transcribe
subprocess.run(cmd, check=True)
File "C:\Users\san-a\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmp15_8s0zt" -o "./tmp/tmp15_8s0zt/" --filename "{stem}.{ext}"' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\cli.py", line 106, in <module>
cli()
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\cli.py", line 90, in cli
subtitle_path = transcribe(
File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe\src\utils\task.py", line 85, in transcribe
raise Exception(
Exception: Error. Vocal extracter unavailable. Received: demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmp15_8s0zt" -o "./tmp/tmp15_8s0zt/" --filename "{stem}.{ext}"
Error Code: Command 'demucs --two-stems=vocals "C:\Users\san-a\AppData\Local\Temp\tmp15_8s0zt" -o "./tmp/tmp15_8s0zt/" --filename "{stem}.{ext}"' returned non-zero exit status 1.
I have 48 GB of RAM and a 4090, so I'm not sure what's going on, but in HWInfo I can see that the load did spike.
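(Note: the out-of-memory error refers to GPU VRAM rather than system RAM, and the demucs log itself suggests CUDA_LAUNCH_BLOCKING=1 for a more precise stack trace. A minimal way to set that for one cmd session and re-run the same command; it only helps with debugging and does not free memory:)
    rem make CUDA report errors synchronously so the failing call is pinpointed
    set CUDA_LAUNCH_BLOCKING=1
    python .\cli.py "F:\j\abc-123\abc-123.HD.mp4" --output "F:\j\abc-123\xxx.srt" -lang ja --task translate --model large --device cuda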
And then if I close the cmd window, open a new one, and try to rerun the command, I get the error from my previous comment. Does this mean that every time we open the command line and run the venv script, we also then have to run webui to reinstall all the dependencies?
Does this mean that every time we open the command line and run the venv script, we also then have to run webui to reinstall all the dependencies?
@amerodeh
No, the web UI only needs to be executed once if you use the command line. This error happened because I forgot to add the FFmpeg path for the command-line mode.
In addition, kindly create a new issue instead of reporting it in a closed issue.
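(A minimal interim workaround, assuming FFmpeg is installed under a hypothetical C:\ffmpeg\bin: prepend it to PATH in the same cmd session before running cli.py:)
    rem add the (hypothetical) FFmpeg location to PATH for this cmd session only
    set PATH=C:\ffmpeg\bin;%PATH%
    where ffmpeg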
It worked now in my first test:
Getting this error with 3.0 Alpha: ModuleNotFoundError: No module named 'whisper_timestamped'