ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Medium and large model stuck #820

Open. emanueleielo opened this issue 1 year ago

emanueleielo commented 1 year ago

Hi, I'm running whisper.cpp on Windows 10. Here is some system info for my computer: (screenshot attached) and this is the console output: (screenshot attached). After this output the process closes (3-6 seconds). In the Windows performance monitor I can see memory start to be allocated, and then the run crashes.

Any information about this?

BPOH commented 1 year ago

How long is the file? I've noticed that it transcribes poorly for anything over 20 minutes. Today I split a two-hour file into 15-minute chunks with a bash script, and everything worked fine.
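
For reference, a minimal sketch of that kind of splitting using ffmpeg's segment muxer (the file names here are placeholders, and ffmpeg is assumed to be installed):

```bash
# Split a long recording into 15-minute (900 s) chunks,
# re-encoded as 16 kHz mono 16-bit PCM WAV as whisper.cpp expects.
ffmpeg -i long_recording.wav -f segment -segment_time 900 \
       -ar 16000 -ac 1 -c:a pcm_s16le chunk_%03d.wav
```

Each chunk_NNN.wav can then be passed to whisper.cpp separately.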

emanueleielo commented 1 year ago

> How long is the file? I've noticed that it transcribes poorly for anything over 20 minutes. Today I split a two-hour file into 15-minute chunks with a bash script, and everything worked fine.

@BPOH

What system are you using? Windows? Can you tell me your system info?

My audio is very long, but I also tried with a short one.

BPOH commented 1 year ago

OS: Artix Linux x86_64 Kernel: 6.2.11-artix1-1 WM: i3 CPU: 12th Gen Intel i7-12700H (20) @ 4.600GHz GPU: Intel Alder Lake-P Memory: 3976MiB / 15720MiB

I don't think it's the system; it's plain C/C++ with no dependencies. Either your file is too long, or you haven't converted it to WAV as described in the documentation. My observation is that I run into problems if I submit a file longer than 20 minutes.
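
The conversion the documentation refers to is the 16 kHz mono WAV format that whisper.cpp expects; a sketch of the usual ffmpeg invocation (input and output names are placeholders):

```bash
# Convert any input audio to 16 kHz, mono, 16-bit PCM WAV for whisper.cpp
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
```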

emanueleielo commented 1 year ago

> OS: Artix Linux x86_64 Kernel: 6.2.11-artix1-1 WM: i3 CPU: 12th Gen Intel i7-12700H (20) @ 4.600GHz GPU: Intel Alder Lake-P Memory: 3976MiB / 15720MiB
>
> I don't think it's the system; it's plain C/C++ with no dependencies. Either your file is too long, or you haven't converted it to WAV as described in the documentation. My observation is that I run into problems if I submit a file longer than 20 minutes.

@BPOH Hmm, but with the tiny or base model I can actually run the inference.

BPOH commented 1 year ago

Today I used ggml-medium.bin to transcribe a two-hour file, well, roughly two hours of audio, broken into 15-minute pieces...

If you need to transcribe English, you don't need the large models. I was working with Russian, and the small models don't handle it.
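
For illustration, a typical invocation of the whisper.cpp example binary with the medium model and Russian as the source language (the paths are just assumptions, adjust to your setup):

```bash
# Transcribe a Russian 16 kHz mono WAV file with the medium model
./main -m models/ggml-medium.bin -l ru -f audio.wav
```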

emanueleielo commented 1 year ago

> Today I used ggml-medium.bin to transcribe a two-hour file, well, roughly two hours of audio, broken into 15-minute pieces...
>
> If you need to transcribe English, you don't need the large models. I was working with Russian, and the small models don't handle it.

I'm using it mostly for research; I would like to understand the limitations of the large model for production purposes.

BPOH commented 1 year ago

On Windows, you can try https://github.com/Const-me/Whisper. I just tried to run it and it worked; I checked it with the medium and large models, but somehow it seemed slower than on the CPU. You have a better video card than mine, so it may work faster for you.

emanueleielo commented 1 year ago

Thank you! I will try it for sure.

04041b commented 1 year ago

Same problem here. I believe you have done the same thing as me, which is downloading the wrong version. The release file names are extremely confusing. You should download whisper-bin-x64.zip if you are using an x64 system. I was able to solve this by switching to the correct version.
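
For anyone else hitting this, a sketch of grabbing and unpacking the x64 build from the releases page (the version tag below is only an example, check the releases page for the current one):

```bash
# Download and unpack the prebuilt x64 Windows binaries
# (v1.4.0 is only an example tag; use the latest release)
curl -LO https://github.com/ggerganov/whisper.cpp/releases/download/v1.4.0/whisper-bin-x64.zip
unzip whisper-bin-x64.zip -d whisper-bin-x64
```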