ggerganov / whisper.spm

whisper.cpp package for the Swift Package Manager
MIT License

Quantized models: wtype != GGML_TYPE_COUNT #12

Closed: vizaiapp closed this issue 1 year ago

vizaiapp commented 1 year ago

I quantized the medium English model locally using Q5_0, and when loading the model from a file via whisper.spm I hit an assertion failure: whisper/ggml.c:4213: wtype != GGML_TYPE_COUNT. Below is the function it originates from:

enum ggml_type ggml_ftype_to_ggml_type(enum ggml_ftype ftype) {
    enum ggml_type wtype = GGML_TYPE_COUNT;

    switch (ftype) {
        case GGML_FTYPE_ALL_F32:              wtype = GGML_TYPE_F32;   break;
        case GGML_FTYPE_MOSTLY_F16:           wtype = GGML_TYPE_F16;   break;
        case GGML_FTYPE_MOSTLY_Q4_0:          wtype = GGML_TYPE_Q4_0;  break;
        case GGML_FTYPE_MOSTLY_Q4_1:          wtype = GGML_TYPE_Q4_1;  break;
        case GGML_FTYPE_MOSTLY_Q4_2:          wtype = GGML_TYPE_Q4_2;  break;
        case GGML_FTYPE_MOSTLY_Q5_0:          wtype = GGML_TYPE_Q5_0;  break;
        case GGML_FTYPE_MOSTLY_Q5_1:          wtype = GGML_TYPE_Q5_1;  break;
        case GGML_FTYPE_MOSTLY_Q8_0:          wtype = GGML_TYPE_Q8_0;  break;
        case GGML_FTYPE_UNKNOWN:              wtype = GGML_TYPE_COUNT; break;
        case GGML_FTYPE_MOSTLY_Q4_1_SOME_F16: wtype = GGML_TYPE_COUNT; break;
    }

    GGML_ASSERT(wtype != GGML_TYPE_COUNT);

    return wtype;
}

I've updated to the latest commit on the master branch, which should include support for quantized models. Are there any extra build steps needed to run quantized models? I assumed I could just swap out the model files.

ggerganov commented 1 year ago

Try re-quantizing the model and see if the problem persists. There was an update to the quantization format recently.
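Re-quantizing might look roughly like the following, using the quantize tool that ships with whisper.cpp (a sketch only; the paths and model filenames are assumptions based on the repo's usual layout):

```shell
# Rebuild against the current quantization format, then re-quantize
# the original F16 model to Q5_0 (paths/filenames assumed).
git pull
make quantize
./quantize models/ggml-medium.en.bin models/ggml-medium.en-q5_0.bin q5_0
# Then point whisper.spm at the freshly written q5_0 file.
```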

vizaiapp commented 1 year ago

Thanks! It looks like my SPM package was pinned to the v1.4.1 commit; updating to the v1.4.2 commit fixed it.