runs perfectly with the regular models, but not the quantized ones

ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++

MIT License


runs perfectly with the regular models, but not the quantized ones #993

Open patrickfconnolly opened 1 year ago

patrickfconnolly commented 1 year ago

I ran quantization on ggml-small.en.bin to produce ggml-small.en.bin-q5_0.bin. Quantization completed without any errors.

When I run the model, it attempts to load, but throws the following:

GGML_ASSERT: ggml.c:4288: wtype != GGML_TYPE_COUNT
Abort trap: 6

Same issue occurs when I try 8-bit quantization.

Running on a 2015 MacBook Air, in case that is useful.

fieldplower commented 1 year ago

Ye same thing over here, sad hours

StaffanBetner commented 1 year ago

I get the same error. I use Windows 10 with this version: whisper-bin-x64.zip

>main.exe -f input.wav -l auto -m models\ggml-model-whisper-small.en-q5_1.bin
whisper_init_from_file_no_state: loading model from 'models\ggml-model-whisper-small.en-q5_1.bin'
whisper_model_load: loading model
GGML_ASSERT: D:\a\whisper.cpp\whisper.cpp\ggml.c:4213: wtype != GGML_TYPE_COUNT

ggerganov commented 1 year ago

What happens if you build from source using the latest master? Make sure to run make clean first.

teddybear082 commented 1 year ago

Will there be a new official release, by any chance, to support the breaking change to the ggml models made since April 30? I get the same error as StaffanBetner with the 1.4.0 Windows 64-bit exe release and the q5 model downloaded from ggml.ggerganov, so I assume this is the reason. Thanks for all your work on this project! It's amazing.
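
One way to check whether a model file was written in a newer format than the binary expects is to read the file-type field from the model header directly. The following is a minimal Python sketch, not the project's loader. It rests on three assumptions that should be verified against your whisper.cpp checkout: the header starts with a uint32 magic followed by eleven little-endian int32 hyperparameters ending in ftype, the magic is 0x67676d6c, and newer quantized files store ftype as quantization version times 1000 plus the base type. The names parse_header, read_header and QNT_VERSION_FACTOR are hypothetical.

```python
import struct

# Assumed header layout: uint32 magic, then 11 int32 hyperparameters,
# the last of which is ftype. Verify against your whisper.cpp source.
HEADER_FMT = "<I11i"
GGML_MAGIC = 0x67676D6C  # assumed magic value ("ggml")
HPARAM_NAMES = (
    "n_vocab", "n_audio_ctx", "n_audio_state", "n_audio_head",
    "n_audio_layer", "n_text_ctx", "n_text_state", "n_text_head",
    "n_text_layer", "n_mels", "ftype",
)
# Assumption: newer files encode ftype as qnt_version * 1000 + base_ftype.
QNT_VERSION_FACTOR = 1000


def parse_header(data: bytes) -> dict:
    """Parse the assumed model header and split ftype into its two parts."""
    size = struct.calcsize(HEADER_FMT)
    if len(data) < size:
        raise ValueError("file is too short to contain the assumed header")
    magic, *values = struct.unpack(HEADER_FMT, data[:size])
    if magic != GGML_MAGIC:
        raise ValueError(f"unexpected magic: {magic:#x}")
    hparams = dict(zip(HPARAM_NAMES, values))
    qnt_version, base_ftype = divmod(hparams["ftype"], QNT_VERSION_FACTOR)
    hparams["qnt_version"] = qnt_version
    hparams["base_ftype"] = base_ftype
    return hparams


def read_header(path: str) -> dict:
    """Read only the header bytes from a model file and parse them."""
    with open(path, "rb") as f:
        return parse_header(f.read(struct.calcsize(HEADER_FMT)))
```

Calling read_header on a quantized model and printing the ftype, qnt_version and base_ftype entries shows what the file declares. If those assumptions hold, a nonzero qnt_version on a file that an older binary rejects would be consistent with the format mismatch described here, though it does not prove it.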

StaffanBetner commented 1 year ago

> What happens if you build from source using latest master Make sure to make clean first

I downloaded 1.4.2 from here: https://github.com/ggerganov/whisper.cpp/actions/runs/4973278607 and it works with the quantized models. It might be a good idea to highlight in the README that there are automatically compiled builds under the Actions tab.

teddybear082 commented 1 year ago

> Maybe it is a good idea to highlight in the readme that there are autocompiled versions under the action tab

Wow, thanks. Yes, I had no idea! I appreciate you letting us know.

tazz4843 commented 1 year ago

I can reproduce this using cuBLAS on an RTX 3060 with v1.4.2. I would rather not use master if possible, so a new release that wraps up the changes made since then would be appreciated.

[zero@archlinux stt-service-whisper]$ ./target/release/scripty_stt_service /opt/whisper_models/ggml-base-q4_0.bin 4
2023-07-01T23:52:15.291328Z  INFO scripty_stt_service: loading models
2023-07-01T23:52:15.291340Z  INFO stts_speech_to_text: attempting to load model
whisper_init_from_file_no_state: loading model from '/opt/whisper_models/ggml-base-q4_0.bin'
whisper_model_load: loading model
GGML_ASSERT: /home/zero/.cargo/git/checkouts/whisper-rs-c8b9fa7c09c8c913/26c5974/sys/whisper.cpp/ggml.c:4213: wtype != GGML_TYPE_COUNT
Aborted (core dumped)

ggerganov commented 1 year ago

Will probably make a new release after merging https://github.com/ggerganov/whisper.cpp/pull/1058