I quantized the medium English model locally using Q5_0, and when loading the model from a file via whisper.spm I hit an assertion on the model type: whisper/ggml.c:4213: wtype != GGML_TYPE_COUNT. Below is the function it originates from:
enum ggml_type ggml_ftype_to_ggml_type(enum ggml_ftype ftype) {
    enum ggml_type wtype = GGML_TYPE_COUNT;

    switch (ftype) {
        case GGML_FTYPE_ALL_F32:              wtype = GGML_TYPE_F32;   break;
        case GGML_FTYPE_MOSTLY_F16:           wtype = GGML_TYPE_F16;   break;
        case GGML_FTYPE_MOSTLY_Q4_0:          wtype = GGML_TYPE_Q4_0;  break;
        case GGML_FTYPE_MOSTLY_Q4_1:          wtype = GGML_TYPE_Q4_1;  break;
        case GGML_FTYPE_MOSTLY_Q4_2:          wtype = GGML_TYPE_Q4_2;  break;
        case GGML_FTYPE_MOSTLY_Q5_0:          wtype = GGML_TYPE_Q5_0;  break;
        case GGML_FTYPE_MOSTLY_Q5_1:          wtype = GGML_TYPE_Q5_1;  break;
        case GGML_FTYPE_MOSTLY_Q8_0:          wtype = GGML_TYPE_Q8_0;  break;
        case GGML_FTYPE_UNKNOWN:              wtype = GGML_TYPE_COUNT; break;
        case GGML_FTYPE_MOSTLY_Q4_1_SOME_F16: wtype = GGML_TYPE_COUNT; break;
    }

    GGML_ASSERT(wtype != GGML_TYPE_COUNT);

    return wtype;
}
I've updated to the latest commit on the master branch, which should have support for quantized models. Are there any extra build steps needed to run quantized models? I assumed I could just swap out the model files.