thewh1teagle / vibe

Transcribe on your own!
https://thewh1teagle.github.io/vibe/
MIT License

Bug: Crash on loading model #79

Closed MultipleOneLeaf closed 1 month ago

MultipleOneLeaf commented 1 month ago

What happened?

After I hit "Transcribe", the app crashes.

Since, at least from what I've noticed, there are no crash logs, here's what the cmd log shows:

log
```
C:\Windows\System32>[2024-05-22T01:57:23Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T01:57:24Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T01:57:33Z DEBUG vibe::model] Transcribe called with {
    "path": "D:\\2023-11-30 11-25-58.mkv",
    "model_path": "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin",
    "lang": "en",
    "verbose": false,
    "n_threads": 4,
    "init_prompt": "",
    "temperature": 0.4
}
[2024-05-22T01:57:33Z DEBUG vibe::audio] input is D:\2023-11-30 11-25-58.mkv and output is C:\Users\AUTUMN~1\AppData\Local\Temp\.tmptp0lsV.wav
[2024-05-22T01:57:33Z DEBUG vibe::audio::encoder] decoder channel layout is 2
[2024-05-22T01:57:33Z DEBUG vibe::audio::encoder] +-----------+
|    in     |default--[48000Hz fltp:stereo]--auto_aresample_0:default
| (abuffer) |
+-----------+

                                                   +---------------+
Parsed_anull_0:default--[16000Hz s16:mono]--default|      out      |
                                                   | (abuffersink) |
                                                   +---------------+

                                                     +----------------+
auto_aresample_0:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default
                                                     |    (anull)     |
                                                     +----------------+

                                          +------------------+
in:default--[48000Hz fltp:stereo]--default| auto_aresample_0 |default--[16000Hz s16:mono]--Parsed_anull_0:default
                                          |   (aresample)    |
                                          +------------------+

[2024-05-22T01:57:33Z DEBUG vibe::audio] wav reader read from "C:\\Users\\AUTUMN~1\\AppData\\Local\\Temp\\.tmptp0lsV.wav"
[2024-05-22T01:57:33Z DEBUG vibe::audio] parsing C:\Users\AUTUMN~1\AppData\Local\Temp\.tmptp0lsV.wav
[2024-05-22T01:57:33Z DEBUG vibe::model] open model...
```

Steps to reproduce

  1. Open Vibe
  2. Attempt to Transcribe
  3. Crash when trying to load model

Tried on Windows 10 first, and now on Windows 11. Also, my brother's pc (Windows 10) for some reason has no issue with Vibe.

Could there be some conflict because I have an old CPU (i7-3770 vs. my brother's Ryzen 5 2600), an old motherboard, or something like that? In case it helps, I am able to run Stable Diffusion locally without issues (both Automatic 1111 and ComfyUI).

I should also note that the user shown in Vibe's logs is AUTUMN~1 instead of AutumnLeaf. Since I've seen a few past issues here caused by Hebrew usernames, perhaps the tilde (which shouldn't even be there) could be a cause.

What OS are you seeing the problem on?

Windows

logs

### Relevant log output

```shell
App Version: 1.0.7
Commit Hash: 99ae746dc02135ad7a27ec0f9adafe016b8c96e4
Arch: x86_64
Platform: windows
Kernel Version: 10.0.22631
OS: windows
OS Version: 10.0.22631
Models: ggml-medium.bin
Default Model: "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin"
```
thewh1teagle commented 1 month ago

See DEBUG.md

thewh1teagle commented 1 month ago

@MultipleOneLeaf I added more details on how to check that. What's the exact username you have? (You can see it by entering echo %username% in cmd.exe.)

MultipleOneLeaf commented 1 month ago
  1. Same result. Though I am sure it was still a valid file before since it worked on my brother's pc.
  2. It just closes, no explicit error messages.
  3. Terminal logs:
log
```
C:\Users\AutumnLeaf>set RUST_BACKTRACE=1
C:\Users\AutumnLeaf>set RUST_LOG=trace
C:\Users\AutumnLeaf>%localappdata%\vibe\vibe.exe
C:\Users\AutumnLeaf>[2024-05-22T13:31:39Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T13:31:39Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T13:31:39Z TRACE hyper_util::client::legacy::pool] checkout waiting for idle connection: ("https", github.com)
[2024-05-22T13:31:39Z DEBUG reqwest::connect] starting new connection: https://github.com/
[2024-05-22T13:31:39Z TRACE hyper_util::client::legacy::connect::http] Http::connect; scheme=Some("https"), host=Some("github.com"), port=None
[2024-05-22T13:31:39Z DEBUG hyper_util::client::legacy::connect::dns] resolving host="github.com"
[2024-05-22T13:31:39Z DEBUG hyper_util::client::legacy::connect::http] connecting to 140.82.121.4:443
[2024-05-22T13:31:39Z DEBUG hyper_util::client::legacy::connect::http] connected to 140.82.121.4:443
[2024-05-22T13:31:39Z TRACE hyper_util::client::legacy::client] http1 handshake complete, spawning background dispatcher task
[2024-05-22T13:31:39Z TRACE hyper_util::client::legacy::pool] checkout dropped for ("https", github.com)
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] put; add idle connection for ("https", github.com)
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", github.com)
[2024-05-22T13:31:40Z DEBUG reqwest::async_impl::client] redirecting 'https://github.com/thewh1teagle/vibe/releases/latest/download/latest.json' to 'https://github.com/thewh1teagle/vibe/releases/download/v1.0.7/latest.json'
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] take? ("https", github.com): expiration = Some(90s)
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::pool] reuse idle connection for ("https", github.com)
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] put; add idle connection for ("https", github.com)
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", github.com)
[2024-05-22T13:31:40Z DEBUG reqwest::async_impl::client] redirecting 'https://github.com/thewh1teagle/vibe/releases/download/v1.0.7/latest.json' to 'https://objects.githubusercontent.com/github-production-release-asset-2e65be/740293013/c94ac253-eaae-4caa-9ba7-909ec0e6c829?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20240522%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240522T133139Z&X-Amz-Expires=300&X-Amz-Signature=482feea65247fbadc647fa52b521ec24f9253d9d061b3a3110d043615f6dedf0&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=740293013&response-content-disposition=attachment%3B%20filename%3Dlatest.json&response-content-type=application%2Foctet-stream'
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] checkout waiting for idle connection: ("https", objects.githubusercontent.com)
[2024-05-22T13:31:40Z DEBUG reqwest::connect] starting new connection: https://objects.githubusercontent.com/
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::connect::http] Http::connect; scheme=Some("https"), host=Some("objects.githubusercontent.com"), port=None
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::connect::dns] resolving host="objects.githubusercontent.com"
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::connect::http] connecting to 185.199.109.133:443
[2024-05-22T13:31:40Z DEBUG hyper_util::client::legacy::connect::http] connected to 185.199.109.133:443
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::client] http1 handshake complete, spawning background dispatcher task
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] checkout dropped for ("https", objects.githubusercontent.com)
[2024-05-22T13:31:40Z TRACE hyper_util::client::legacy::pool] pool dropped, dropping pooled (("https", objects.githubusercontent.com))
[2024-05-22T13:31:49Z WARN tao::platform_impl::platform::event_loop::runner] NewEvents emitted without explicit RedrawEventsCleared
[2024-05-22T13:31:49Z WARN tao::platform_impl::platform::event_loop::runner] RedrawEventsCleared emitted without explicit MainEventsCleared
[2024-05-22T13:31:52Z DEBUG vibe::model] Transcribe called with {
    "path": "C:\\Users\\AutumnLeaf\\Downloads\\whisper-bin-x64\\samples_single.wav",
    "model_path": "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin",
    "lang": "en",
    "verbose": false,
    "n_threads": 4,
    "init_prompt": "",
    "temperature": 0.4
}
[2024-05-22T13:31:52Z DEBUG vibe::audio] input is C:\Users\AutumnLeaf\Downloads\whisper-bin-x64\samples_single.wav and output is C:\Users\AUTUMN~1\AppData\Local\Temp\.tmp6srujO.wav
[2024-05-22T13:31:52Z DEBUG vibe::audio::encoder] decoder channel layout is 0
[2024-05-22T13:31:52Z DEBUG vibe::audio::encoder] +-----------+
|    in     |default--[16000Hz s16:mono]--Parsed_anull_0:default
| (abuffer) |
+-----------+

                                                   +---------------+
Parsed_anull_0:default--[16000Hz s16:mono]--default|      out      |
                                                   | (abuffersink) |
                                                   +---------------+

                                       +----------------+
in:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default
                                       |    (anull)     |
                                       +----------------+

[2024-05-22T13:31:52Z DEBUG vibe::audio] wav reader read from "C:\\Users\\AUTUMN~1\\AppData\\Local\\Temp\\.tmp6srujO.wav"
[2024-05-22T13:31:52Z DEBUG vibe::audio] parsing C:\Users\AUTUMN~1\AppData\Local\Temp\.tmp6srujO.wav
[2024-05-22T13:31:52Z DEBUG vibe::model] open model...
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\AutumnLeaf\AppData\Local\github.com.thewh1teagle.vibe\ggml-medium.bin'
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: loading model
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_vocab = 51865
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_ctx = 1500
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_state = 1024
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_head = 16
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_layer = 24
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_ctx = 448
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_state = 1024
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_head = 16
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_layer = 24
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_mels = 80
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: ftype = 1
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: qntvr = 0
[2024-05-22T13:31:52Z INFO whisper_rs::whisper_sys_log] whisper_model_load: type = 4 (medium)
```
4-8. Maybe it's some problem with whisper.
Here's the terminal log:
```
C:\Users\AutumnLeaf\Downloads\whisper-bin-x64>main.exe -m "%localappdata%\github.com.thewh1teagle.vibe\ggml-medium.bin" -f "samples_single.wav"
whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\AutumnLeaf\AppData\Local\github.com.thewh1teagle.vibe\ggml-medium.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 4 (medium)
```

Event Viewer logs - 2 with Vibe + 2 with Whisper:

Vibe errors:

Event 1000 - Application Error
```
Faulting application name: vibe.exe, version: 1.0.7.0, time stamp: 0x664c0376
Faulting module name: vibe.exe, version: 1.0.7.0, time stamp: 0x664c0376
Exception code: 0xc000001d
Fault offset: 0x0000000000c2d04a
Faulting process id: 0x0x29D0
Faulting application start time: 0x0x1DAAC4B7161191E
Faulting application path: C:\Users\AutumnLeaf\AppData\Local\vibe\vibe.exe
Faulting module path: C:\Users\AutumnLeaf\AppData\Local\vibe\vibe.exe
Report Id: a2e17e74-83cc-4613-8109-5dfca9a4da90
Faulting package full name:
Faulting package-relative application ID:
```

Followed by Event 1005 - Application Error
```
Windows cannot access the file for one of the following reasons: there is a problem with the network connection, the disk that the file is stored on, or the storage drivers installed on this computer; or the disk is missing.
Windows closed the program Vibe because of this error.

Program: Vibe
File:

The error value is listed in the Additional Data section.
User Action
1. Open the file again. This situation might be a temporary problem that corrects itself when the program runs again.
2. If the file still cannot be accessed and
- It is on the network, your network administrator should verify that there is not a problem with the network and that the server can be contacted.
- It is on a removable disk, for example, a floppy disk or CD-ROM, verify that the disk is fully inserted into the computer.
3. Check and repair the file system by running CHKDSK. To run CHKDSK, click Start, click Run, type CMD, and then click OK. At the command prompt, type CHKDSK /F, and then press ENTER.
5. If the problem persists, restore the file from a backup copy.
6. Determine whether other files on the same disk can be opened. If not, the disk might be damaged. If it is a hard disk, contact your administrator or computer hardware vendor for further assistance.

Additional Data
Error value: 0x0
Disk type: 0x0
```

Whisper errors:

Event 1000 - Application error:
```
Faulting application name: main.exe, version: 0.0.0.0, time stamp: 0x6644b82c
Faulting module name: whisper.dll, version: 0.0.0.0, time stamp: 0x6644b81e
Exception code: 0xc000001d
Fault offset: 0x00000000000a0d24
Faulting process id: 0x0x2164
Faulting application start time: 0x0x1DAAC4BE362C819
Faulting application path: C:\Users\AutumnLeaf\Downloads\whisper-bin-x64\main.exe
Faulting module path: C:\Users\AutumnLeaf\Downloads\whisper-bin-x64\whisper.dll
Report Id: 16c30353-9eb1-4ef6-ab6c-d32baf747669
Faulting package full name:
Faulting package-relative application ID:
```

Followed by Event 1005 - Application Error
```
Windows cannot access the file for one of the following reasons: there is a problem with the network connection, the disk that the file is stored on, or the storage drivers installed on this computer; or the disk is missing.
Windows closed the program main.exe because of this error.

Program: main.exe
File:

The error value is listed in the Additional Data section.
User Action
1. Open the file again. This situation might be a temporary problem that corrects itself when the program runs again.
2. If the file still cannot be accessed and
- It is on the network, your network administrator should verify that there is not a problem with the network and that the server can be contacted.
- It is on a removable disk, for example, a floppy disk or CD-ROM, verify that the disk is fully inserted into the computer.
3. Check and repair the file system by running CHKDSK. To run CHKDSK, click Start, click Run, type CMD, and then click OK. At the command prompt, type CHKDSK /F, and then press ENTER.
4. If the problem persists, restore the file from a backup copy.
5. Determine whether other files on the same disk can be opened. If not, the disk might be damaged. If it is a hard disk, contact your administrator or computer hardware vendor for further assistance.

Additional Data
Error value: 0x0
Disk type: 0x0
```

Regarding my username:
```
C:\Users\AutumnLeaf\Downloads\whisper-bin-x64>echo %username%
AutumnLeaf
```
thewh1teagle commented 1 month ago

Thanks!! I opened a bug in the whisper.cpp repo. The only missing information is the OS details along with the CPU / GPU. Can you add it? Run the following commands in cmd.exe:

winget install neofetch
neofetch
MultipleOneLeaf commented 1 month ago

Just for transparency's sake, on the first result it didn't detect the GPU, but after doing winsat formal it was correctly detected. I then restarted the pc and retried transcribing with Vibe - it still crashes.

Note: while it says Windows 11, the problem also occurred while it was Windows 10.

logs

First result:
```
AutumnLeaf@AUTUMNLEAF-PC
--------------
OS: Windows 11
Build: 23H2 (22631)
Uptime: 0 days, 15 hours, 4 minutes
Resolution: 1920x1080 @60Hz
Terminal: Command Prompt - neofetch
CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
GPU: Unknown (try running WinSAT to fix this)
Memory: 10073 MB / 16336 MB (61% in use)
Disk: C:\ 228.05 GB (160.45 GB free)
```

Second result:
```
AutumnLeaf@AUTUMNLEAF-PC
--------------
OS: Windows 11
Build: 23H2 (22631)
Uptime: 0 days, 15 hours, 7 minutes
Resolution: 1920x1080 @60Hz
Terminal: Command Prompt - neofetch
CPU: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
GPU: NVIDIA GeForce RTX 3060
Memory: 10078 MB / 16336 MB (61% in use)
Disk: C:\ 228.05 GB (160.44 GB free)
```
thewh1teagle commented 1 month ago

@MultipleOneLeaf I think this is related to an old issue in whisper.cpp: some CPUs do not support the f16c instruction that whisper.cpp uses. I compiled vibe with f16c disabled, which I think should solve the crash, at least on some PCs. ~You can download it from vibe_1.0.7_x64-setup_no_f16c.exe~

By the way, you can check whether your CPU supports the f16c instruction by running:

set RUST_LOG=vibe=trace
%localappdata%\vibe\vibe.exe

It is printed in the first log messages.

MultipleOneLeaf commented 1 month ago

Unfortunately that didn't fix it for me. As can be seen in the log, my cpu supports f16c:

C:\Users\AutumnLeaf>[2024-05-22T19:17:42Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T19:17:42Z DEBUG vibe_desktop::setup] CPU supports f16c: true
thewh1teagle commented 1 month ago

@MultipleOneLeaf

Per the Event Viewer entry you posted, exception code 0xc000001d means the program asked the CPU to execute an instruction that it does not support.
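For reference, the codes that show up in the Event Viewer reports in this thread are standard Windows NTSTATUS values. The lookup below is only a sketch for reading such reports; the helper name `describe_exception_code` is made up for illustration and is not part of Vibe or whisper.cpp.

```rust
/// Maps the Windows exception codes seen in this thread to their NTSTATUS names.
/// `describe_exception_code` is a hypothetical helper for illustration only.
fn describe_exception_code(code: u32) -> Option<&'static str> {
    match code {
        // Raised when the CPU is asked to run an instruction it does not implement.
        0xC000_001D => Some("STATUS_ILLEGAL_INSTRUCTION"),
        // Fail-fast code that appears in a later ucrtbase.dll crash report in this thread.
        0xC000_0409 => Some("STATUS_STACK_BUFFER_OVERRUN"),
        // Anything else is outside the scope of this sketch.
        _ => None,
    }
}

fn main() {
    for code in [0xC000_001D_u32, 0xC000_0409, 0xC000_0005] {
        match describe_exception_code(code) {
            Some(name) => println!("{code:#010x}: {name}"),
            None => println!("{code:#010x}: not covered by this sketch"),
        }
    }
}
```

An illegal-instruction fault in a numeric library usually points at a build that assumes SIMD extensions the CPU lacks, which is why the builds discussed here disable individual instruction sets.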

I compiled vibe again with avx2 disabled. ~You can download it from vibe_1.0.7_x64-setup_no_avx2.exe~

I also added more logging to show which features the CPU supports. You can see it by running:

set RUST_LOG=vibe=trace
%localappdata%\vibe\vibe.exe
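For anyone who wants to check the same CPU features outside Vibe, here is a minimal sketch using the Rust standard library macro `std::is_x86_feature_detected!`. This is not Vibe's actual logging code; the helper name `cpu_features` and the choice of features are assumptions for illustration.

```rust
/// Returns (feature name, supported) pairs for a few x86 SIMD extensions.
/// `cpu_features` is a hypothetical helper, not taken from Vibe's source.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn cpu_features() -> Vec<(&'static str, bool)> {
    vec![
        ("AVX", std::is_x86_feature_detected!("avx")),
        ("AVX2", std::is_x86_feature_detected!("avx2")),
        ("FMA", std::is_x86_feature_detected!("fma")),
        ("F16C", std::is_x86_feature_detected!("f16c")),
    ]
}

/// On other architectures the macro is unavailable, so report nothing.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn cpu_features() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    for (name, supported) in cpu_features() {
        println!("{name}: {supported}");
    }
}
```

The detection happens at run time, so the same binary can report different results on different machines.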
MultipleOneLeaf commented 1 month ago

Also didn't work. Here's the log:

logs
```
C:\Users\AutumnLeaf>set RUST_LOG=vibe=trace
C:\Users\AutumnLeaf>%localappdata%\vibe\vibe.exe
C:\Users\AutumnLeaf>[2024-05-22T22:05:08Z DEBUG vibe_desktop] Vibe App Running
[2024-05-22T22:05:08Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-22T22:05:08Z DEBUG vibe_desktop::setup] CPU Features:
AVX: true
AVX2: false
AVX512: false
AVX512-VBMI: false
AVX512-VNNI: false
F16C: true
[2024-05-22T22:05:13Z DEBUG vibe::model] Transcribe called with {
    "path": "C:\\Users\\AutumnLeaf\\Downloads\\whisper-bin-x64\\samples_single.wav",
    "model_path": "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin",
    "lang": "en",
    "verbose": false,
    "n_threads": 4,
    "init_prompt": "",
    "temperature": 0.4
}
[2024-05-22T22:05:13Z DEBUG vibe::audio] input is C:\Users\AutumnLeaf\Downloads\whisper-bin-x64\samples_single.wav and output is C:\Users\AUTUMN~1\AppData\Local\Temp\.tmpqZioal.wav
[2024-05-22T22:05:13Z DEBUG vibe::audio::encoder] decoder channel layout is 0
[2024-05-22T22:05:13Z DEBUG vibe::audio::encoder] +-----------+
|    in     |default--[16000Hz s16:mono]--Parsed_anull_0:default
| (abuffer) |
+-----------+

                                                   +---------------+
Parsed_anull_0:default--[16000Hz s16:mono]--default|      out      |
                                                   | (abuffersink) |
                                                   +---------------+

                                       +----------------+
in:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default
                                       |    (anull)     |
                                       +----------------+

[2024-05-22T22:05:13Z DEBUG vibe::audio] wav reader read from "C:\\Users\\AUTUMN~1\\AppData\\Local\\Temp\\.tmpqZioal.wav"
[2024-05-22T22:05:13Z DEBUG vibe::audio] parsing C:\Users\AUTUMN~1\AppData\Local\Temp\.tmpqZioal.wav
[2024-05-22T22:05:13Z DEBUG vibe::model] open model...
```
thewh1teagle commented 1 month ago

@MultipleOneLeaf

That looks better! As I thought, your CPU doesn't support avx2. I believe that for some reason I compiled it incorrectly and avx2 is still being used.

thewh1teagle commented 1 month ago

~Maybe this one: [vibe_1.0.7_x64-setup_no_avx_avx2_fma_f16c.exe](https://github.com/thewh1teagle/vibe/releases/download/v1.0.7/vibe_1.0.7_x64-setup_no_avx_avx2_fma_f16c.exe) (also added more important line to log)~

~Install then~
~set RUST_LOG=vibe=debug,whisper_rs=debug~
~%localappdata%\vibe\vibe.exe~

Updated one with almost all cpu features disabled:

vibe_1.0.7_x64-setup_no_avx_fma_f16c.exe

set RUST_LOG=vibe=debug,whisper_rs=debug
%localappdata%\vibe\vibe.exe
MultipleOneLeaf commented 1 month ago

It works now! Here's the log just in case:

logs
```
C:\Users\AutumnLeaf>set RUST_LOG=vibe=debug,whisper_rs=debug
C:\Users\AutumnLeaf>%localappdata%\vibe\vibe.exe
C:\Users\AutumnLeaf>[2024-05-23T00:30:25Z DEBUG vibe_desktop] Vibe App Running
[2024-05-23T00:30:25Z DEBUG vibe_desktop::setup] webview version: 125.0.2535.51
[2024-05-23T00:30:25Z DEBUG vibe_desktop::setup] CPU Features:
AVX: true
AVX2: false
AVX512: false
AVX512-VBMI: false
AVX512-VNNI: false
FMA: false
F16C: true
[2024-05-23T00:30:25Z DEBUG vibe_desktop::setup] COMMIT_HASH: cc1bb4f721a794c58081b139e57808c6ec8e00a5
[2024-05-23T00:30:30Z DEBUG vibe::model] Transcribe called with {
    "path": "C:\\Users\\AutumnLeaf\\Downloads\\whisper-bin-x64\\samples_single.wav",
    "model_path": "C:\\Users\\AutumnLeaf\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-medium.bin",
    "lang": "en",
    "verbose": false,
    "n_threads": 4,
    "init_prompt": "",
    "temperature": 0.4
}
[2024-05-23T00:30:30Z DEBUG vibe::audio] input is C:\Users\AutumnLeaf\Downloads\whisper-bin-x64\samples_single.wav and output is C:\Users\AUTUMN~1\AppData\Local\Temp\.tmpHz2Pyh.wav
[2024-05-23T00:30:30Z DEBUG vibe::audio::encoder] decoder channel layout is 0
[2024-05-23T00:30:30Z DEBUG vibe::audio::encoder] +-----------+
|    in     |default--[16000Hz s16:mono]--Parsed_anull_0:default
| (abuffer) |
+-----------+

                                                   +---------------+
Parsed_anull_0:default--[16000Hz s16:mono]--default|      out      |
                                                   | (abuffersink) |
                                                   +---------------+

                                       +----------------+
in:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default
                                       |    (anull)     |
                                       +----------------+

[2024-05-23T00:30:30Z DEBUG vibe::audio] wav reader read from "C:\\Users\\AUTUMN~1\\AppData\\Local\\Temp\\.tmpHz2Pyh.wav"
[2024-05-23T00:30:31Z DEBUG vibe::audio] parsing C:\Users\AUTUMN~1\AppData\Local\Temp\.tmpHz2Pyh.wav
[2024-05-23T00:30:31Z DEBUG vibe::model] open model...
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\AutumnLeaf\AppData\Local\github.com.thewh1teagle.vibe\ggml-medium.bin'
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: loading model
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_vocab = 51865
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_ctx = 1500
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_state = 1024
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_head = 16
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_layer = 24
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_ctx = 448
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_state = 1024
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_head = 16
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_layer = 24
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_mels = 80
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: ftype = 1
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: qntvr = 0
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: type = 4 (medium)
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: adding 1608 extra tokens
[2024-05-23T00:30:31Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_langs = 99
[2024-05-23T00:30:35Z INFO whisper_rs::whisper_sys_log] whisper_model_load: CPU total size = 1533.14 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_model_load: model size = 1533.14 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv self size = 132.12 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv cross size = 147.46 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (conv) = 28.68 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (encode) = 594.22 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (cross) = 7.85 MB
[2024-05-23T00:30:37Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (decode) = 138.87 MB
[2024-05-23T00:30:37Z DEBUG vibe::model] set language to Some("en")
[2024-05-23T00:30:37Z DEBUG vibe::model] setting temperature to 0.4
[2024-05-23T00:30:37Z DEBUG vibe::model] setting init prompt to
[2024-05-23T00:30:37Z DEBUG vibe::model] setting n threads to 4
[2024-05-23T00:30:37Z DEBUG vibe::model] set start time...
[2024-05-23T00:30:37Z DEBUG vibe::model] setting state full...
[2024-05-23T00:30:37Z DEBUG vibe::model] progress callback 0
[2024-05-23T00:30:37Z DEBUG vibe_desktop::cmd] set_progress_bar 0
[2024-05-23T00:30:56Z DEBUG vibe::model] progress callback 100
[2024-05-23T00:30:56Z DEBUG vibe_desktop::cmd] set_progress_bar 100
[2024-05-23T00:30:56Z DEBUG vibe::model] getting segments count...
[2024-05-23T00:30:56Z DEBUG vibe::model] found 1 segments
[2024-05-23T00:30:56Z DEBUG vibe::model] looping segments...
```

Should I close the issue or is there anything else you'd like me to test?

thewh1teagle commented 1 month ago

Amazing! Thanks for your patience :)

I think I'll disable automatic updates for computers that don't support those instructions (avx, fma, etc.). Also, instead of letting whisper.cpp crash the whole program, I'll just print an error with a link to download this working executable.
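The idea of reporting an error instead of crashing could look roughly like the sketch below: compare the features the build was compiled to use against what the CPU reports, and return an error before the model is loaded. Everything here is hypothetical (the function names, the error text, and the list of required features); it is not how Vibe implements it.

```rust
/// Returns the required features that are absent or reported as unsupported.
/// Both slices use plain feature names such as "AVX2"; the names are illustrative.
fn missing_features(detected: &[(&str, bool)], required: &[&str]) -> Vec<String> {
    required
        .iter()
        .filter(|&&name| !detected.iter().any(|&(n, ok)| n == name && ok))
        .map(|name| name.to_string())
        .collect()
}

/// Hypothetical pre-flight check to run before loading the model.
fn preflight(detected: &[(&str, bool)], required: &[&str]) -> Result<(), String> {
    let missing = missing_features(detected, required);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "This CPU lacks {}; please install a build compiled without these instructions.",
            missing.join(", ")
        ))
    }
}

fn main() {
    // Feature flags as reported in the log above for the i7-3770.
    let detected = [("AVX", true), ("AVX2", false), ("FMA", false), ("F16C", true)];
    // Assumption: a default build that relies on AVX2, FMA and F16C.
    let required = ["AVX2", "FMA", "F16C"];
    match preflight(&detected, &required) {
        Ok(()) => println!("CPU supports everything this build needs"),
        Err(message) => eprintln!("{message}"),
    }
}
```

Running the check before the native library is called matters because an illegal-instruction fault cannot be caught as an ordinary Rust error once it happens.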

thewh1teagle commented 1 month ago

> Should I close the issue or is there anything else you'd like me to test?

Let's wait to see if others have suggestions.

It would be interesting to know the speed that you get when transcribing without those instructions (which are related to optimization). You can see the time it takes in the console after transcription finishes (right click -> inspect element).

MultipleOneLeaf commented 1 month ago

> Amazing! Thanks for your patience :)

Thank you!

> speed that you get when transcribing

On the sample you mentioned above (11 seconds of sound), it took 19 seconds.

thewh1teagle commented 1 month ago

> > Amazing! Thanks for your patience :)
>
> Thank you!
>
> > speed that you get when transcribing
>
> On the sample you mentioned above (11 seconds of sound), it took 19 seconds.

That means one hour of audio would take about 1 hour and 30-40 minutes. On my CPU with optimizations (no GPU) it takes 1 hour and 20-25 minutes. With your NVIDIA GeForce RTX 3060 it should be much faster. Does Task Manager show the GPU being used while transcribing?
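The extrapolation can be reproduced with a little arithmetic. The sketch below assumes processing time scales linearly with audio length, which is a simplification; the function name `estimate_seconds` is made up for illustration. With the reported numbers (11 seconds of audio in 19 seconds) the real-time factor is about 1.73, so one hour of audio comes out at roughly 104 minutes, slightly above the rough estimate given here.

```rust
/// Scales a measured sample run to a longer recording, assuming processing
/// time grows linearly with audio length (an assumption, not a measured fact).
fn estimate_seconds(sample_audio_s: f64, sample_elapsed_s: f64, target_audio_s: f64) -> f64 {
    target_audio_s * (sample_elapsed_s / sample_audio_s)
}

fn main() {
    // Figures reported in the thread: an 11-second sample took 19 seconds.
    let one_hour = estimate_seconds(11.0, 19.0, 3600.0);
    println!(
        "real-time factor: {:.2}, one hour of audio: about {:.0} minutes",
        19.0 / 11.0,
        one_hour / 60.0
    );
}
```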

Update

I released a new installer for Nvidia GPUs! If you're interested, you can try vibe_1.0.7_x64-setup_nvidia_no_avx2_fma.exe

It's pretty heavy (300-500 MB), though you don't have to install it: you can extract it with 7-zip and start vibe.exe directly. I can't test it since I don't have an Nvidia GPU, so I hope it will work :)

MultipleOneLeaf commented 1 month ago

Just tested with my GPU. I extracted in case it ends up making any difference in speed.

The sample file took 3 seconds.

I then also tried transcribing a 20-minute file (though probably only about 10 minutes of it is speech), and it took 89-90 seconds. With a different language selected (the one actually spoken in this file, instead of English), it took 96 seconds.

The time with CPU, from the experiment my brother did with Ryzen 5 2600 before I opened this issue, was about 40 minutes I believe.

thewh1teagle commented 1 month ago

That's amazing! It's nearly as fast as high-quality transcriptions from paid APIs. Now, anyone with Nvidia GPUs can easily turn their PC into a high-quality transcription machine :)

MultipleOneLeaf commented 1 month ago

Yup. These results might be a good form of advertisement to increase interest in your project.

And, in my opinion, the best part is that it is also run locally.

Thank you once again for your efforts!

manav0619 commented 1 month ago

Hello. I am using an RTX 4070 Ti Super with 7800X3D. Do I also need to install the CUDA Toolkit to use this Nvidia version you have linked? https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local

I tried the default version of Vibe, but it ran on the CPU when transcribing. However, the new Nvidia version you have linked in this reply just crashes a few seconds after clicking transcribe.

thewh1teagle commented 1 month ago

@manav0619 If your GPU is Nvidia, you might not need to install the CUDA Toolkit. It's better to troubleshoot why it's crashing. Could you check DEBUG.md and share the details here?

In the section about trying the original whisper, you can use the following version: whisper-cublas-12.2.0-bin-x64.zip

manav0619 commented 1 month ago

@thewh1teagle Thank you for the quick response, and apologies for being late. I am pasting below the details of the two events from Event Viewer related to the latest crash. Please let me know if you require anything else.


log ``` Log Name: Application Source: Windows Error Reporting Date: 24-05-2024 22:56:38 Event ID: 1001 Task Category: None Level: Information Keywords: User: MANAV\manav Computer: Manav Description: Fault bucket 1174612636899832668, type 5 Event Name: BEX64 Response: Not available Cab Id: 0 Problem signature: P1: vibe.exe P2: 1.0.7.0 P3: 664ea5c9 P4: ucrtbase.dll P5: 10.0.22621.3593 P6: 10c46e71 P7: 000000000007f6fe P8: c0000409 P9: 0000000000000007 P10: Attached files: \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.21288926-428d-4704-aab6-7391182e9765.tmp.dmp \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.c34cc256-4d6b-4c2a-8f7b-14dbd8e2404d.tmp.WERInternalMetadata.xml \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.fc41f8e1-e6f1-4233-9e43-a94fb6b70168.tmp.csv \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.6531543f-492f-4f63-8a73-d9112d8131d2.tmp.txt \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.54b29807-f0d6-4538-9922-43dd4ad0452f.tmp.xml These files may be available here: \\?\C:\ProgramData\Microsoft\Windows\WER\ReportArchive\AppCrash_vibe.exe_4e17b63542f0a14447376ec6afa4d1f3fb1bca_eade4aff_81a2193e-4e71-4216-b4ef-defd592c17e6 Analysis symbol: Rechecking for solution: 0 Report Id: f390c383-a607-4a96-a3a5-3a9db346359b Report Status: 268435456 Hashed bucket: 205d49a766ef9f8be04d0ff84b19735c Cab Guid: 0 Event Xml: 1001 0 4 0 0 0x8000000000000000 4109 Application Manav 1174612636899832668 5 BEX64 Not available 0 vibe.exe 1.0.7.0 664ea5c9 ucrtbase.dll 10.0.22621.3593 10c46e71 000000000007f6fe c0000409 0000000000000007 \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.21288926-428d-4704-aab6-7391182e9765.tmp.dmp \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.c34cc256-4d6b-4c2a-8f7b-14dbd8e2404d.tmp.WERInternalMetadata.xml \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.fc41f8e1-e6f1-4233-9e43-a94fb6b70168.tmp.csv \\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.6531543f-492f-4f63-8a73-d9112d8131d2.tmp.txt 
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER.54b29807-f0d6-4538-9922-43dd4ad0452f.tmp.xml \\?\C:\ProgramData\Microsoft\Windows\WER\ReportArchive\AppCrash_vibe.exe_4e17b63542f0a14447376ec6afa4d1f3fb1bca_eade4aff_81a2193e-4e71-4216-b4ef-defd592c17e6 0 f390c383-a607-4a96-a3a5-3a9db346359b 268435456 205d49a766ef9f8be04d0ff84b19735c 0 Log Name: Application Source: Application Error Date: 24-05-2024 22:56:35 Event ID: 1000 Task Category: Application Crashing Events Level: Error Keywords: User: MANAV\manav Computer: Manav Description: Faulting application name: vibe.exe, version: 1.0.7.0, time stamp: 0x664ea5c9 Faulting module name: ucrtbase.dll, version: 10.0.22621.3593, time stamp: 0x10c46e71 Exception code: 0xc0000409 Fault offset: 0x000000000007f6fe Faulting process id: 0x0x2308 Faulting application start time: 0x0x1DAADFF6F0D0267 Faulting application path: C:\Users\manav\AppData\Local\vibe\vibe.exe Faulting module path: C:\Windows\System32\ucrtbase.dll Report Id: f390c383-a607-4a96-a3a5-3a9db346359b Faulting package full name: Faulting package-relative application ID: Event Xml: 1000 0 2 100 0 0x8000000000000000 4108 Application Manav vibe.exe 1.0.7.0 664ea5c9 ucrtbase.dll 10.0.22621.3593 10c46e71 c0000409 000000000007f6fe 0x2308 0x1daadff6f0d0267 C:\Users\manav\AppData\Local\vibe\vibe.exe C:\Windows\System32\ucrtbase.dll f390c383-a607-4a96-a3a5-3a9db346359b ```
thewh1teagle commented 1 month ago

@manav0619 I can't tell why it happens from the Event Viewer log. Please try to capture the application logs using these instructions:

instructions

a. Open `cmd.exe`
b. Execute:

```console
set RUST_BACKTRACE=1
set RUST_LOG=vibe=debug,whisper_rs=debug
%localappdata%\vibe\vibe.exe
```

Also:

1. Open `cmd.exe`
2. Execute the following:

```console
winget install neofetch
neofetch
```
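The same debug launch can be scripted, which also makes the process exit code visible. This is a minimal Python sketch, assuming Vibe is installed at `%localappdata%\vibe\vibe.exe` as in the commands above; `debug_env` and `run_vibe` are hypothetical helper names, and the log output still goes to the console the script is started from.

```python
import os
import subprocess
from pathlib import Path


def debug_env(base=None):
    """Return a copy of the environment with the two logging variables set."""
    env = dict(os.environ if base is None else base)
    env["RUST_BACKTRACE"] = "1"
    env["RUST_LOG"] = "vibe=debug,whisper_rs=debug"
    return env


def run_vibe(exe: Path) -> int:
    """Run the executable with debug logging enabled and return its exit code."""
    return subprocess.run([str(exe)], env=debug_env()).returncode


# Example usage on Windows (not executed here):
#   exe = Path(os.environ["LOCALAPPDATA"]) / "vibe" / "vibe.exe"
#   code = run_vibe(exe)
#   print(f"exit code: 0x{code & 0xFFFFFFFF:08X}")
```

If the app crashes, the printed exit code can be compared with the exception code shown in Event Viewer.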
manav0619 commented 1 month ago

@thewh1teagle

logs ``` Microsoft Windows [Version 10.0.22631.3593] (c) Microsoft Corporation. All rights reserved. C:\Users\manav>%localappdata%\vibe\vibe.exe C:\Users\manav>set RUST_BACKTRACE=1 C:\Users\manav>set RUST_LOG=vibe=debug,whisper_rs=debug C:\Users\manav>%localappdata%\vibe\vibe.exe C:\Users\manav>[2024-05-24T21:33:55Z DEBUG vibe_desktop] Vibe App Running [2024-05-24T21:33:55Z DEBUG vibe_desktop::setup] webview version: 124.0.2478.97 [2024-05-24T21:33:55Z DEBUG vibe_desktop::setup] CPU Features: AVX: true AVX2: true AVX512: true AVX512-VBMI: true AVX512-VNNI: true FMA: true F16C: true [2024-05-24T21:33:55Z DEBUG vibe_desktop::setup] COMMIT_HASH: cc1bb4f721a794c58081b139e57808c6ec8e00a5 [2024-05-24T21:34:10Z DEBUG vibe::model] Transcribe called with { "path": "C:\\Users\\manav\\Desktop\\Audios\\samples_single.wav", "model_path": "C:\\Users\\manav\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-large-v3.bin", "lang": "en", "verbose": false, "n_threads": 4, "init_prompt": "", "temperature": 0.4 } [2024-05-24T21:34:10Z DEBUG vibe::audio] input is C:\Users\manav\Desktop\Audios\samples_single.wav and output is C:\Users\manav\AppData\Local\Temp\.tmpflC3KN.wav [2024-05-24T21:34:10Z DEBUG vibe::audio::encoder] decoder channel layout is 0 [2024-05-24T21:34:10Z DEBUG vibe::audio::encoder] +-----------+ | in |default--[16000Hz s16:mono]--Parsed_anull_0:default | (abuffer) | +-----------+ +---------------+ Parsed_anull_0:default--[16000Hz s16:mono]--default| out | | (abuffersink) | +---------------+ +----------------+ in:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default | (anull) | +----------------+ [2024-05-24T21:34:10Z DEBUG vibe::audio] wav reader read from "C:\\Users\\manav\\AppData\\Local\\Temp\\.tmpflC3KN.wav" [2024-05-24T21:34:10Z DEBUG vibe::audio] parsing C:\Users\manav\AppData\Local\Temp\.tmpflC3KN.wav [2024-05-24T21:34:10Z DEBUG vibe::model] open model... 
[2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\manav\AppData\Local\github.com.thewh1teagle.vibe\ggml-large-v3.bin' [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: loading model [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_vocab = 51866 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_ctx = 1500 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_state = 1280 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_head = 20 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_layer = 32 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_ctx = 448 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_state = 1280 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_head = 20 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_layer = 32 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_mels = 128 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: ftype = 1 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: qntvr = 0 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: type = 5 (large v3) [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: adding 1609 extra tokens [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_langs = 100 [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_backend_init: using CUDA backend [2024-05-24T21:34:10Z INFO whisper_rs::whisper_sys_log] whisper_model_load: CUDA0 total size = 3094.36 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_model_load: model size = 3094.36 MB 
[2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_backend_init: using CUDA backend [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv self size = 220.20 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv cross size = 245.76 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (conv) = 36.26 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (encode) = 926.66 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (cross) = 9.38 MB [2024-05-24T21:34:41Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (decode) = 209.26 MB [2024-05-24T21:34:41Z DEBUG vibe::model] set language to Some("en") [2024-05-24T21:34:41Z DEBUG vibe::model] setting temperature to 0.4 [2024-05-24T21:34:41Z DEBUG vibe::model] setting init prompt to [2024-05-24T21:34:41Z DEBUG vibe::model] setting n threads to 4 [2024-05-24T21:34:41Z DEBUG vibe::model] set start time... [2024-05-24T21:34:41Z DEBUG vibe::model] setting state full... [2024-05-24T21:34:41Z DEBUG vibe::model] progress callback 0 [2024-05-24T21:34:41Z DEBUG vibe_desktop::cmd] set_progress_bar 0 ``` ``` Downloading https://github.com/nepnep39/neofetch-win/releases/download/v1.2.1/neofetch121.msi ██████████████████████████████ 734 KB / 734 KB Successfully verified installer hash Starting package install... 
Successfully installed PS C:\Users\manav> neofetch llllllllllllllll llllllllllllllll manav@MANAV llllllllllllllll llllllllllllllll -------------- llllllllllllllll llllllllllllllll OS: Windows 11 llllllllllllllll llllllllllllllll Build: 23H2 (22631) llllllllllllllll llllllllllllllll Uptime: 0 days, 0 hours, 6 minutes llllllllllllllll llllllllllllllll Resolution: 2560x1440 @180Hz llllllllllllllll llllllllllllllll Terminal: Administrator: Windows PowerShell llllllllllllllll llllllllllllllll CPU: AMD Ryzen 7 7800X3D 8-Core Processor llllllllllllllll llllllllllllllll GPU: Unknown (try running WinSAT to fix this) Memory: 9667 MB / 31893 MB (30% in use) llllllllllllllll llllllllllllllll Disk: C:\ 1906.85 GB (1257.77 GB free) llllllllllllllll llllllllllllllll llllllllllllllll llllllllllllllll Mem%: -=[ //////////////////// ]=- llllllllllllllll llllllllllllllll llllllllllllllll llllllllllllllll Disk%: -=[ //////////////////// ]=- llllllllllllllll llllllllllllllll llllllllllllllll llllllllllllllll llllllllllllllll llllllllllllllll ```
thewh1teagle commented 1 month ago

@manav0619

It looks like the error happens in the whisper.cpp engine when it tries to load the model. You may need to install the CUDA Toolkit.

Try downloading cuDNN (direct download link) and the CUDA Toolkit (direct download link). Also download the newer version of Vibe: vibe_1.0.7_x64-setup_nvidia.exe

If you haven't updated your GPU driver, also download rtx-4070_ti_super-studio-driver (direct link).

manav0619 commented 1 month ago

@thewh1teagle I installed both CUDA Toolkit and cudnn, but it's still crashing.

log C:\Users\manav>[2024-05-25T00:24:58Z DEBUG vibe_desktop] Vibe App Running [2024-05-25T00:24:58Z DEBUG vibe_desktop::setup] webview version: 124.0.2478.97 [2024-05-25T00:24:58Z DEBUG vibe_desktop::setup] CPU Features: AVX: true AVX2: true AVX512: true AVX512-VBMI: true AVX512-VNNI: true FMA: true F16C: true [2024-05-25T00:24:58Z DEBUG vibe_desktop::setup] COMMIT_HASH: cac8c0b522888a128e544f2a7d95d3f016ae30c2 [2024-05-25T00:25:06Z DEBUG vibe::model] Transcribe called with { "path": "C:\\Users\\manav\\Desktop\\Audios\\samples_single.wav", "model_path": "C:\\Users\\manav\\AppData\\Local\\github.com.thewh1teagle.vibe\\ggml-large-v3.bin", "lang": "en", "verbose": false, "n_threads": 4, "init_prompt": "", "temperature": 0.4 } [2024-05-25T00:25:06Z DEBUG vibe::audio] input is C:\Users\manav\Desktop\Audios\samples_single.wav and output is C:\Users\manav\AppData\Local\Temp\.tmpAyu2jE.wav [2024-05-25T00:25:06Z DEBUG vibe::audio::encoder] decoder channel layout is 0 [2024-05-25T00:25:06Z DEBUG vibe::audio::encoder] +-----------+ | in |default--[16000Hz s16:mono]--Parsed_anull_0:default | (abuffer) | +-----------+ +---------------+ Parsed_anull_0:default--[16000Hz s16:mono]--default| out | | (abuffersink) | +---------------+ +----------------+ in:default--[16000Hz s16:mono]--default| Parsed_anull_0 |default--[16000Hz s16:mono]--out:default | (anull) | +----------------+ [2024-05-25T00:25:06Z DEBUG vibe::audio] wav reader read from "C:\\Users\\manav\\AppData\\Local\\Temp\\.tmpAyu2jE.wav" [2024-05-25T00:25:06Z DEBUG vibe::audio] parsing C:\Users\manav\AppData\Local\Temp\.tmpAyu2jE.wav [2024-05-25T00:25:06Z DEBUG vibe::model] open model... 
[2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_init_from_file_with_params_no_state: loading model from 'C:\Users\manav\AppData\Local\github.com.thewh1teagle.vibe\ggml-large-v3.bin' [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: loading model [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_vocab = 51866 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_ctx = 1500 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_state = 1280 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_head = 20 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_audio_layer = 32 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_ctx = 448 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_state = 1280 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_head = 20 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_text_layer = 32 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_mels = 128 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: ftype = 1 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: qntvr = 0 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: type = 5 (large v3) [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: adding 1609 extra tokens [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_model_load: n_langs = 100 [2024-05-25T00:25:06Z INFO whisper_rs::whisper_sys_log] whisper_backend_init: using CUDA backend [2024-05-25T00:25:07Z INFO whisper_rs::whisper_sys_log] whisper_model_load: CUDA0 total size = 3094.36 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_model_load: model size = 3094.36 MB 
[2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_backend_init: using CUDA backend [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv self size = 220.20 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: kv cross size = 245.76 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (conv) = 36.26 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (encode) = 926.66 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (cross) = 9.38 MB [2024-05-25T00:25:08Z INFO whisper_rs::whisper_sys_log] whisper_init_state: compute buffer (decode) = 209.26 MB [2024-05-25T00:25:08Z DEBUG vibe::model] set language to Some("en") [2024-05-25T00:25:08Z DEBUG vibe::model] setting temperature to 0.4 [2024-05-25T00:25:08Z DEBUG vibe::model] setting init prompt to [2024-05-25T00:25:08Z DEBUG vibe::model] setting n threads to 4 [2024-05-25T00:25:08Z DEBUG vibe::model] set start time... [2024-05-25T00:25:08Z DEBUG vibe::model] setting state full... [2024-05-25T00:25:08Z DEBUG vibe::model] progress callback 0 [2024-05-25T00:25:08Z DEBUG vibe_desktop::cmd] set_progress_bar 0
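The buffer sizes reported in the log above can be added up to get a rough idea of how much memory whisper.cpp requested for the large-v3 model. This is a minimal Python sketch: the excerpt reuses the values from the log, the regular expression is an assumption about the line format, and whether every one of these buffers lives on the GPU is also an assumption.

```python
import re

# Size lines copied from the log (timestamps and logger prefix removed).
LOG_EXCERPT = """\
whisper_model_load: model size = 3094.36 MB
whisper_init_state: kv self size = 220.20 MB
whisper_init_state: kv cross size = 245.76 MB
whisper_init_state: compute buffer (conv) = 36.26 MB
whisper_init_state: compute buffer (encode) = 926.66 MB
whisper_init_state: compute buffer (cross) = 9.38 MB
whisper_init_state: compute buffer (decode) = 209.26 MB
"""

# Matches "model size", "kv ... size" and "compute buffer (...)" entries.
# "CUDA0 total size" is deliberately not matched, since it repeats the model size.
SIZE_RE = re.compile(
    r"(model size|kv \w+ size|compute buffer \(\w+\))\s*=\s*([\d.]+) MB"
)


def buffer_sizes(log_text: str) -> dict:
    """Map each buffer name found in the log text to its size in MB."""
    return {name: float(mb) for name, mb in SIZE_RE.findall(log_text)}


sizes = buffer_sizes(LOG_EXCERPT)
total_mb = sum(sizes.values())
print(f"{len(sizes)} buffers, {total_mb:.2f} MB in total")
```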

I would also like to mention that another tool named Whisper GUI works fine and does use the GPU. I transcribed two files with it even before installing Vibe and CUDA, and had no problems. Edit: Just now, I tried the v3 model on Faster Whisper GUI to transcribe an hour-long file, which is also working fine and used 100% of the GPU.

thewh1teagle commented 1 month ago

@manav0619 It looks like this previously happened in Whisper Faster too. Try opening the advanced options in the main window and setting the temperature to 0.

thewh1teagle commented 1 month ago

Reported in whisper.cpp/issues/2187

manav0619 commented 1 month ago

@thewh1teagle Hello. I believe there was an update to the application, which I let install. It says vibe 1.0.9 in the settings. I am not sure if this is still the Nvidia-optimized version. But anyway, I am able to transcribe the files in this version, independent of the temperature setting. This time, it seems to be utilizing both the CPU and GPU.

The process is still much slower compared to the Faster Whisper GUI I linked above, which utilizes 90+% of the GPU, with GPU temperature remaining at 50-53 °C. Vibe's GPU usage is fluctuating around 70-80%, with GPU temperature remaining at 47-48 °C.

Both of these screenshots are when transcribing the same 70-minute audio file using Whisper large-v3. Vibe took almost half an hour. Faster Whisper GUI was done in around 2 minutes. image image
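To put the two timings side by side, they can be expressed as a real-time factor (minutes of audio processed per minute of wall-clock time). This is a minimal Python sketch using the figures quoted above, with "almost half an hour" rounded to 30 minutes as an assumption.

```python
def realtime_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Minutes of audio transcribed per minute of processing time."""
    return audio_minutes / processing_minutes


AUDIO_MINUTES = 70

# "Almost half an hour" is rounded to 30 minutes here (assumption).
vibe_rtf = realtime_factor(AUDIO_MINUTES, 30)
faster_whisper_rtf = realtime_factor(AUDIO_MINUTES, 2)

print(f"Vibe: about {vibe_rtf:.1f}x real time")
print(f"Faster Whisper GUI: about {faster_whisper_rtf:.0f}x real time")
print(f"Speed ratio: about {faster_whisper_rtf / vibe_rtf:.0f}x")
```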

thewh1teagle commented 1 month ago

Great to know that at least it doesn't crash anymore! By default, Vibe uses OpenCL GPU optimization, which is pretty good in general but not close to the Nvidia optimization. The Nvidia builds are very heavy, which is why I don't include them by default. However, I compiled a new build of Vibe 1.0.9 with Nvidia optimization enabled. It should work much faster, possibly even close to faster-whisper. Try it from vibe_1.0.9_x64-setup_nvidia_whisper_1.6.2.exe

thewh1teagle commented 1 month ago

I added a check for whether the CPU is supported; if it isn't, Vibe will display an error message with instructions on how to fix it. In addition, there's a new 2.0 release with Nvidia support (see Vibe's README).
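A check of this kind can be approximated from the `CPU Features:` line that Vibe already prints in its debug log. This is a minimal Python sketch, not Vibe's actual implementation: the `REQUIRED` list is a hypothetical example, since the thread does not say which features Vibe checks, and `parse_cpu_features` and `missing_features` are illustrative names.

```python
import re

# Hypothetical requirement list, for illustration only.
REQUIRED = ("AVX", "AVX2", "FMA", "F16C")


def parse_cpu_features(line: str) -> dict:
    """Parse 'NAME: true/false' pairs that follow 'CPU Features:' in a log line."""
    _, _, tail = line.partition("CPU Features:")
    pairs = re.findall(r"([A-Z0-9-]+):\s*(true|false)", tail)
    return {name: value == "true" for name, value in pairs}


def missing_features(features: dict, required=REQUIRED) -> list:
    """Return the required features that are absent or reported as false."""
    return [name for name in required if not features.get(name, False)]


# Line copied from the debug log earlier in the thread.
LOG_LINE = (
    "[2024-05-24T21:33:55Z DEBUG vibe_desktop::setup] CPU Features: "
    "AVX: true AVX2: true AVX512: true AVX512-VBMI: true "
    "AVX512-VNNI: true FMA: true F16C: true"
)

features = parse_cpu_features(LOG_LINE)
missing = missing_features(features)
print("unsupported CPU, missing: " + ", ".join(missing) if missing else "CPU looks supported")
```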