Open · jackleibest opened this issue 8 months ago
CC @RyanMetcalfeInt8
According to the manual, I just want to speed up inference on the CPU via OpenVINO; however, I got the problem below.

```
(openvino_conv_env) [root@zaozhuang3L-C6-35 whisper.cpp]# ./main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_no_state: failed to load model
error: failed to initialize whisper context
```
To use OpenVINO, you'll need two models: the original whisper ggml model and the OpenVINO-converted model. Make sure to place both models in the same directory and provide the path of the original whisper ggml model when you run the main program.
I placed both models in the same directory; however, it now shows "illegal instruction".
@jackleibest What machine do you use? Is it inside Docker?
centos 7.9 without docker
Do you have the OpenVINO toolkit installed on your machine?
Yes, following the instructions:

```shell
cd models
python3 -m venv openvino_conv_env
source openvino_conv_env/bin/activate
python -m pip install --upgrade pip
pip install -r openvino-conversion-requirements.txt
python convert-whisper-to-openvino.py --model medium
source /path/to/l_openvino_toolkit_ubuntu22_2023.0.0.10926.b4452d56304_x86_64/setupvars.sh
cmake -B build -DWHISPER_OPENVINO=1
cmake --build build -j --config Release
./main -m models/ggml-medium.bin -f samples/jfk.wav
```
I have the exact same issue using Ubuntu 22.04 in WSL2 on Windows 11. I have a laptop with a Gen11 CPU and a Gen12 GPU, and OpenVINO is installed.
```
~/whisper.cpp$ ./build/bin/main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context
```
The models had been converted successfully.
I found the solution here: https://github.com/ggerganov/whisper.cpp/pull/1694#issuecomment-1870984510
You only need to provide the path of the standard model to `main`. Ensure that the standard model and the OpenVINO encoder model have matching names and are located in the same directory. For instance, if the standard model is named `ABCD.bin`, then the corresponding OpenVINO model should be named `ABCD-encoder-openvino.bin`.
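The naming rule above can be sketched as a small shell helper (hypothetical, not part of whisper.cpp; note also that the converted OpenVINO encoder is an IR pair, and the loader log later in this thread references the `.xml` file):

```shell
# Derive the OpenVINO encoder file name that sits next to the standard
# ggml model, following the rule ABCD.bin -> ABCD-encoder-openvino.*
model="models/ggml-base.bin"
base="${model%.bin}"               # strip the .bin suffix
echo "${base}-encoder-openvino.xml"
# -> models/ggml-base-encoder-openvino.xml
```

`main` is then pointed at `models/ggml-base.bin`, never at the `-encoder-openvino` file itself.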
The working command is:

```shell
./build/bin/main -m models/ggml-base.bin -f samples/jfk.wav
```
I think this can be closed
I also observed the same issue. However, if the model path points to just `ggml-base.bin`, it works. That said, the benchmark results are almost identical to inference without any accelerator.
> place both models under the same directory, however it shows illegal instruction
Have a look at CMakeLists.txt; I'd guess you need to set additional flags such as `-DWHISPER_NO_AVX=ON`, `-DWHISPER_NO_AVX2=ON`, etc. I ran into this today.
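A rebuild with those SIMD paths disabled might look like this (a sketch; the `WHISPER_NO_*` option names are assumptions taken from the suggestion above and from CMakeLists.txt of that era, and may differ in newer versions of the project):

```shell
# Configure whisper.cpp without AVX/AVX2/FMA/F16C for CPUs that lack
# those instruction sets (assumed flag names; verify in CMakeLists.txt).
cmake -B build -DWHISPER_OPENVINO=1 \
      -DWHISPER_NO_AVX=ON -DWHISPER_NO_AVX2=ON \
      -DWHISPER_NO_FMA=ON -DWHISPER_NO_F16C=ON
cmake --build build -j --config Release
```

An "illegal instruction" crash typically means the binary was compiled with instructions (e.g. AVX2) that the host CPU does not support, so disabling them at configure time is the usual fix.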
When I run the command, I very often get `mbind failed: Invalid argument` in the whisper output:
```shell
whisper.cpp -l de -m models/ggml-large-v3.bin /tmp/test_file.wav
```
```
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-large-v3.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load: CPU total size = 3094.36 MB
whisper_model_load: model size    = 3094.36 MB
whisper_init_state: kv self size  = 251.66 MB
whisper_init_state: kv cross size = 251.66 MB
whisper_init_state: kv pad size   = 7.86 MB
whisper_init_state: compute buffer (conv)   = 36.26 MB
whisper_init_state: compute buffer (encode) = 926.66 MB
whisper_init_state: compute buffer (cross)  = 9.38 MB
whisper_init_state: compute buffer (decode) = 213.19 MB
whisper_ctx_init_openvino_encoder: loading OpenVINO model from 'models/ggml-large-v3-encoder-openvino.xml'
whisper_ctx_init_openvino_encoder: first run on a device may take a while ...
whisper_openvino_init: path_model = models/ggml-large-v3-encoder-openvino.xml, device = CPU, cache_dir = models/ggml-large-v3-encoder-openvino-cache
mbind failed: Invalid argument
whisper_ctx_init_openvino_encoder: OpenVINO model loaded

system_info: n_threads = 4 / 20 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 1

main: processing '/tmp/test_file.wav' (4612180 samples, 288.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = de, task = transcribe, timestamps = 1 ...

mbind failed: Invalid argument
(... "mbind failed: Invalid argument" keeps repeating ...)
mbind failed: Invalid argument
```

After some time I do get the text output of my audio file.