ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

When built with CoreML, can no longer run normal models #2278

Open NightMachinery opened 2 months ago

NightMachinery commented 2 months ago
...
whisper_init_state: loading Core ML model from 'models/ggml-distil-large-v3-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: failed to load Core ML model from 'models/ggml-distil-large-v3-encoder.mlmodelc'
...
lucaducceschi commented 1 month ago

Hi, same issue here. Running ./main -m /Users/luca/Gits/whisper.cpp/models/ggml-base-encoder.mlmodelc/coremldata.bin -f file.wav returns

whisper_init_from_file_with_params_no_state: loading model from '/Users/luca/Gits/whisper.cpp/models/ggml-base-encoder.mlmodelc/coremldata.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context

kyjus25 commented 1 month ago

Same with large-v3 (though I assumed a different model wouldn't fix it, just hopeful)

./main -m models/ggml-large-v3-encoder.mlmodelc -f samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-large-v3-encoder.mlmodelc'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context

Update: Interestingly, if I first generate the Core ML model as above, and then also download the regular large-v3 model with ./models/download-ggml-model.sh large-v3, running ./main -m models/ggml-large-v3.bin -f samples/jfk.wav does pick up the ggml-large-v3-encoder.mlmodelc folder I created:

...
whisper_backend_init_gpu: using Metal backend
...
whisper_backend_init_gpu: Metal GPU does not support family 7 - falling back to CPU
...
whisper_init_state: loading Core ML model from 'models/ggml-large-v3-encoder.mlmodelc'
...

Perhaps this is the intended approach and it is just not documented in the README. However, now I see the error described in https://github.com/ggerganov/whisper.cpp/issues/2262
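For anyone landing here: the working sequence above can be sketched as the following shell steps. This is a hedged summary based on the commands in this thread and the whisper.cpp README at the time; script and flag names may differ in newer versions of the repo.

```shell
# 1. Generate the Core ML encoder; this produces
#    models/ggml-large-v3-encoder.mlmodelc (a directory, not a ggml file):
./models/generate-coreml-model.sh large-v3

# 2. Also download the regular ggml model -- ./main always needs the .bin,
#    since the .mlmodelc replaces only the encoder:
./models/download-ggml-model.sh large-v3

# 3. Build with Core ML support enabled (build flag per the README):
WHISPER_COREML=1 make -j

# 4. Point -m at the ggml .bin. The matching *-encoder.mlmodelc sitting
#    next to it in models/ is loaded automatically; passing the .mlmodelc
#    (or its coremldata.bin) to -m yields "invalid model data (bad magic)".
./main -m models/ggml-large-v3.bin -f samples/jfk.wav
```

The "bad magic" errors earlier in the thread come from step 4 being done with the .mlmodelc path instead of the ggml .bin: whisper_model_load expects the ggml container format, which the compiled Core ML bundle does not have.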