Closed MultipleOneLeaf closed 1 month ago
See DEBUG.md
@MultipleOneLeaf
I added more details on how to check that.
What's your exact username? (You can see it by entering `echo %username%` in cmd.exe.)
Thanks!!
I opened a bug report in the whisper.cpp repo.
The only missing info is the OS information with CPU / GPU. Can you add it, using the following commands in cmd.exe?
winget install neofetch
neofetch
Just for transparency's sake, on the first result it didn't detect the GPU, but after doing `winsat formal` it was correctly detected. I then restarted the PC and retried transcribing with Vibe; it still crashes.
Note: while it says Windows 11, the problem also occurred while it was Windows 10.
@MultipleOneLeaf
I think it's related to an old issue in whisper.cpp: some CPUs do not support the `f16c` instruction used by whisper.cpp.
I compiled Vibe with `f16c` disabled. I think that should solve the crash, at least on some PCs.
~You can download it from vibe_1.0.7_x64-setup_no_f16c.exe~
By the way, you can check whether your CPU supports the `f16c` instruction by running:
set RUST_LOG=vibe=trace
%localappdata%\vibe\vibe.exe
It will print this in the first log messages.
Unfortunately that didn't fix it for me. As can be seen in the log, my CPU supports `f16c`:
C:\Users\AutumnLeaf>[2024-05-22T19:17:42Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] CPU supports f16c: true
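When a log gets long, the feature lines can be pulled out mechanically. The following is a minimal sketch, assuming every feature is logged in the same `CPU supports <name>: true|false` form as the `f16c` line above; the function name `parse_cpu_features` is made up for illustration and is not part of Vibe.

```python
import re

# Matches lines such as:
# [2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] CPU supports f16c: true
# Assumption: other features are logged with the same wording.
FEATURE_LINE = re.compile(r"CPU supports (\w+): (true|false)")


def parse_cpu_features(log_text):
    """Return {feature_name: bool} for every 'CPU supports ...' line found."""
    return {name: value == "true" for name, value in FEATURE_LINE.findall(log_text)}


sample_log = """\
[2024-05-22T19:17:42Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] CPU supports f16c: true
"""

features = parse_cpu_features(sample_log)
print(features)
```

Lines that do not mention a CPU feature are simply ignored, so the whole console output can be pasted in as-is.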
@MultipleOneLeaf
Per the exception in Event Viewer you posted, the exception code 0xc000001d means the program asked the CPU to execute an instruction that the CPU doesn't support.
I compiled Vibe again with `avx2` disabled.
~You can download it from vibe_1.0.7_x64-setup_no_avx2.exe~
I also added many logs to show which features the CPU supports. You can see them by running:
set RUST_LOG=vibe=trace
%localappdata%\vibe\vibe.exe
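For readers decoding Event Viewer entries themselves, here is a small sketch that maps a few well-known Windows NTSTATUS exception codes to their symbolic names. The table is deliberately tiny and the helper name `describe_exception_code` is hypothetical; the code from this thread, 0xc000001d, is `STATUS_ILLEGAL_INSTRUCTION`.

```python
# A few well-known NTSTATUS exception codes and their symbolic names.
# This is not a complete table, only enough to illustrate the lookup.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exception_code(code_text):
    """Translate a hex exception code string (as shown in Event Viewer) to a name."""
    code = int(code_text, 16)  # accepts both "0xc000001d" and "c000001d"
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


print(describe_exception_code("0xc000001d"))
```

An illegal-instruction code points at the binary using CPU instructions the machine lacks, which is why the builds in this thread disable individual instruction sets one at a time.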
Also didn't work. Here's the log:
@MultipleOneLeaf
That looks better!
As I thought, your CPU doesn't support `avx2`.
I believe that for some reason I compiled it incorrectly and `avx2` was still used.
~Maybe this one: [vibe_1.0.7_x64-setup_no_avx_avx2_fma_f16c.exe](https://github.com/thewh1teagle/vibe/releases/download/v1.0.7/vibe_1.0.7_x64-setup_no_avx_avx2_fma_f16c.exe) (also added another important line to the log)~
~Install it, then run `set RUST_LOG=vibe=debug,whisper_rs=debug` followed by `%localappdata%\vibe\vibe.exe`~
Updated one with almost all cpu features disabled:
vibe_1.0.7_x64-setup_no_avx_fma_f16c.exe
set RUST_LOG=vibe=debug,whisper_rs=debug
%localappdata%\vibe\vibe.exe
It works now! Here's the log just in case:
Should I close the issue or is there anything else you'd like me to test?
Amazing! Thanks for your patience :)
I think I'll disable automatic updates for computers that don't support those instructions (`avx`, `fma`, etc.). Also, instead of letting whisper.cpp crash the whole program, I'll just print an error with a link to download this working executable.
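The idea of checking up front instead of crashing can be sketched as follows. This is an illustration only, not Vibe's actual implementation: the function names, the message wording, and the `https://example.com/...` URL are all placeholders, and the list of required features is passed in explicitly because the thread does not state the exact set the default build needs.

```python
def missing_features(detected, required):
    """Return the required CPU features that were not detected as supported."""
    return [name for name in required if not detected.get(name, False)]


def startup_message(detected, required, fallback_url):
    """Return an error message if the CPU lacks a required feature, else None."""
    missing = missing_features(detected, required)
    if not missing:
        return None
    return (
        "This CPU does not support: " + ", ".join(missing)
        + ". Please download the compatible build: " + fallback_url
    )


# Values reported in this thread: f16c is supported, avx2 is not.
detected = {"f16c": True, "avx2": False}
message = startup_message(
    detected,
    required=("avx2", "f16c"),
    fallback_url="https://example.com/compatible-build",  # placeholder URL
)
print(message)
```

Returning `None` for a supported CPU keeps the normal startup path untouched, while an unsupported CPU gets an actionable message before the transcription engine is ever loaded.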
> Should I close the issue or is there anything else you'd like me to test?
Let's wait to see if others have suggestions.
I'm curious about the speed that you get when transcribing without those instructions (which are related to optimization). You can see the time it takes in the console after the transcription finishes (right click -> inspect element).
> Amazing! Thanks for your patience :)
Thank you!
> speed that you get when transcribing
On the sample you mentioned above (11 seconds of sound), it took 19 seconds.
That means one hour of audio will take about 1 hour and 30-40 minutes. On my CPU with optimizations (no GPU), it will take 1 hour and 20-25 minutes.
With your NVIDIA GeForce RTX 3060 it should be much faster. Does the Task Manager indicate that it uses the GPU while transcribing?
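The extrapolation above can be sketched as a real-time factor calculation (processing time divided by audio duration). This is a rough linear estimate under the assumption that processing time scales with audio length; a short clip also includes fixed costs such as loading the model, so long files may come out somewhat better than the strict figure of roughly 104 minutes that this sketch gives.

```python
def realtime_factor(audio_seconds, processing_seconds):
    """Processing time per second of audio; values above 1 are slower than real time."""
    return processing_seconds / audio_seconds


def estimate_processing_minutes(audio_minutes, factor):
    """Linear extrapolation of processing time for a longer recording."""
    return audio_minutes * factor


# Numbers from the thread: an 11-second sample took 19 seconds on the CPU-only build.
factor = realtime_factor(11, 19)
estimate = estimate_processing_minutes(60, factor)

print(f"real-time factor: {factor:.2f}")
print(f"estimated time for 60 minutes of audio: {estimate:.0f} minutes")
```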
I released a new installer for Nvidia GPUs! If you're interested, you can try it from vibe_1.0.7_x64-setup_nvidia_no_avx2_fma.exe
It's pretty heavy (300-500 MB), though you don't have to install it. You can extract it with 7-Zip and start `vibe.exe` directly.
I can't test it since I don't have an Nvidia GPU, so I hope it will work :)
Just tested with my GPU. I extracted it, in case that ends up making any difference in speed.
The sample file took 3 seconds.
I also tried then transcribing a file with 20 minutes duration - though probably only about 10 minutes worth of speech - and it took 89-90 seconds. Doing it with a different language (the one in this specific file, instead of English) it took 96 seconds.
The time with CPU, from the experiment my brother did with Ryzen 5 2600 before I opened this issue, was about 40 minutes I believe.
That's amazing! It's nearly as fast as high-quality transcriptions from paid APIs. Now, anyone with Nvidia GPUs can easily turn their PC into a high-quality transcription machine :)
Yup. These results might be a good form of advertisement to increase interest in your project.
And, in my opinion, the best part is that it is also run locally.
Thank you once again for your efforts!
Hello. I am using an RTX 4070 Ti Super with 7800X3D. Do I also need to install the CUDA Toolkit to use this Nvidia version you have linked? https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local
I tried the default version of Vibe, but it ran on the CPU when transcribing. However, the new Nvidia version you have linked in this reply just crashes a few seconds after clicking transcribe.
@manav0619 If your GPU is Nvidia, you might not need to install the CUDA Toolkit. It's better to troubleshoot why it's crashing. Could you check DEBUG.md and share the details here?
In the section about trying the original whisper, you can use the following version: whisper-cublas-12.2.0-bin-x64.zip
@thewh1teagle Thank you for the quick response, and apologies for being late. I am pasting below the details of the two events from Event Viewer related to the latest crash. Please let me know if you require anything else.
@manav0619 I can't tell why it happens from the Event Viewer log. Maybe try to get the logs from
@thewh1teagle
@manav0619
Looks like the error happens in the whisper.cpp engine when it tries to load the model.
Maybe you do need to install the CUDA Toolkit.
Try downloading cuDNN (direct download link)
and the CUDA Toolkit (direct download link).
Also download the newer version of Vibe: vibe_1.0.7_x64-setup_nvidia.exe
If you haven't updated your GPU driver, then also download rtx-4070_ti_super-studio-driver (direct link).
@thewh1teagle I installed both CUDA Toolkit and cudnn, but it's still crashing.
I would also like to mention that another tool named Whisper GUI works fine and does use the GPU. I transcribed two files with it even before installing Vibe and CUDA, and had no problems. Edit: Just now, I tried the v3 model on Faster Whisper GUI to transcribe an hour-long file, which is also working fine and used 100% of the GPU.
@manav0619 Looks like this previously happened in Whisper Faster too. Try opening the advanced options in the main window and setting the temperature to 0.
Reported in whisper.cpp/issues/2187
@thewh1teagle Hello. I believe there was an update to the application, which I allowed to install. It says vibe 1.0.9 in the settings. I am not sure if this is still the Nvidia-optimized version. But anyway, I am able to transcribe the files in this version, independent of the temperature setting. This time, it seems to be utilizing both the CPU and GPU.
The process is still much slower compared to the Faster Whisper GUI I linked above, which utilizes 90+% of the GPU, with GPU temperature remaining at 50-53 °C. Vibe's GPU usage is fluctuating around 70-80%, with GPU temperature remaining at 47-48 °C.
Both of these screenshots are when transcribing the same 70-minute audio file using Whisper large-v3. Vibe took almost half an hour. Faster Whisper GUI was done in around 2 minutes.
Great to know that at least it doesn't crash anymore!
By default, Vibe uses OpenCL GPU optimization, which is pretty good in general but not close to Nvidia's.
The Nvidia builds are very heavy; that's why I don't include them by default.
But I compiled a new version of Vibe 1.0.9 with Nvidia optimization enabled.
It should work much faster, and should even be close to faster-whisper.
Try it from vibe_1.0.9_x64-setup_nvidia_whisper_1.6.2.exe
I added checks for whether the CPU is supported; if it isn't, Vibe will display an error message with instructions on how to fix it. In addition, there's a new 2.0 release with Nvidia support (see Vibe's README).
What happened?
After I hit "Transcribe", the app crashes.
Since, at least from what I've noticed, there are no crash logs, here's what the cmd log shows:
log
```
C:\Windows\System32>[2024-05-22T01:57:23Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T01:57:24Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T01:57:33Z DEBUG vibe::model] Transcribe called with {
  "path": "D:\\2023-11-30 11-25-58.mkv",
  "model_path": "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin",
  "lang": "en",
  "verbose": false,
  "n_threads": 4,
  "init_prompt": "",
  "temperature": 0.4
}
[2024-05-22T01:57:33Z DEBUG vibe::audio] input is D:\2023-11-30 11-25-58.mkv and output is C:\Users\AUTUMN~1\AppData\Local\Temp\.tmptp0lsV.wav
[2024-05-22T01:57:33Z DEBUG vibe::audio::encoder] decoder channel layout is 2
[2024-05-22T01:57:33Z DEBUG vibe::audio::encoder]
+-----------+
|    in     |default--[48000Hz fltp:stereo]--auto_aresample_0:default
| (abuffer) |
+-----------+

                                                   +---------------+
Parsed_anull_0:default--[16000Hz s16:mono]--default|      out      |
                                                   | (abuffersink) |
                                                   +---------------+

                                                     +----------------+
auto_aresample_0:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default
                                                     |    (anull)     |
                                                     +----------------+

                                          +------------------+
in:default--[48000Hz fltp:stereo]--default| auto_aresample_0 |default--[16000Hz s16:mono]--Parsed_anull_0:default
                                          |   (aresample)    |
                                          +------------------+

[2024-05-22T01:57:33Z DEBUG vibe::audio] wav reader read from "C:\\Users\\AUTUMN~1\\AppData\\Local\\Temp\\.tmptp0lsV.wav"
[2024-05-22T01:57:33Z DEBUG vibe::audio] parsing C:\Users\AUTUMN~1\AppData\Local\Temp\.tmptp0lsV.wav
[2024-05-22T01:57:33Z DEBUG vibe::model] open model...
```
Steps to reproduce
Tried on Windows 10 first, and now on Windows 11. Also, my brother's pc (Windows 10) for some reason has no issue with Vibe.
Could it be that my old CPU (i7-3770, vs. my brother's Ryzen 5 2600), my motherboard, or something similar is causing a conflict? In case it helps, I am able to run Stable Diffusion locally without issues (both Automatic 1111 and ComfyUI).
I should also note that the user shown in Vibe's logs is `AUTUMN~1` instead of `AutumnLeaf`. Since I've seen a few past issues here due to Hebrew user names, perhaps the tilde (which shouldn't even be there) could be a cause.
What OS are you seeing the problem on?
Windows
logs
### Relevant log output
```shell
App Version: 1.0.7
Commit Hash: 99ae746dc02135ad7a27ec0f9adafe016b8c96e4
Arch: x86_64
Platform: windows
Kernel Version: 10.0.22631
OS: windows
OS Version: 10.0.22631
Models: ggml-medium.bin
Default Model: "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin"
```