ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

got error when using openvino #1394

Open jackleibest opened 8 months ago

jackleibest commented 8 months ago

According to the manual, I just wanted to speed up inference on the CPU via OpenVINO, but I ran into the problem below:

(openvino_conv_env) [root@zaozhuang3L-C6-35 whisper.cpp]# ./main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_no_state: failed to load model
error: failed to initialize whisper context

ilya-lavrenov commented 8 months ago

CC @RyanMetcalfeInt8

bobqianic commented 8 months ago

> According to the manual, I just wanted to speed up inference on the CPU via OpenVINO... whisper_model_load: invalid model data (bad magic)... error: failed to initialize whisper context

To use OpenVINO, you'll need two models: the original whisper ggml model and the OpenVINO-converted model. Make sure to place both models in the same directory and provide the path of the original whisper ggml model when you run the main program.
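The rule above can be sketched as a small shell check (the helper name is ours, not part of whisper.cpp). Per the loader output later in this thread, whisper.cpp looks for the OpenVINO encoder IR next to the ggml model, derived from the ggml filename with a fixed `-encoder-openvino` suffix:

```shell
# Hypothetical helper: verify that a ggml model and its OpenVINO encoder
# sit side by side, and that ./main is given the ggml model's path.
check_openvino_pair() {
  model=$1                                      # e.g. models/ggml-base.en.bin
  encoder="${model%.bin}-encoder-openvino.xml"  # fixed suffix convention
  if [ -e "$model" ] && [ -e "$encoder" ]; then
    echo "ok: run ./main -m $model (encoder $encoder is picked up automatically)"
  else
    echo "missing: $model and $encoder must both exist in the same directory"
  fi
}

check_openvino_pair "models/ggml-base.en.bin"
```

Passing the `-encoder-openvino` file itself to `-m` is exactly what produces the "invalid model data (bad magic)" error, since it is not a ggml file.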

jackleibest commented 8 months ago

[screenshot] I placed both models in the same directory; however, it now fails with "illegal instruction".

ilya-lavrenov commented 8 months ago

@jackleibest Which machine are you using? Is it inside Docker?

jackleibest commented 8 months ago

CentOS 7.9, without Docker.

bobqianic commented 8 months ago

> CentOS 7.9, without Docker.

Do you have the OpenVINO toolkit installed on your machine?

jackleibest commented 8 months ago

Yes, I followed these instructions:

cd models
python3 -m venv openvino_conv_env
source openvino_conv_env/bin/activate
python -m pip install --upgrade pip
pip install -r openvino-conversion-requirements.txt
python convert-whisper-to-openvino.py --model medium
source /path/to/l_openvino_toolkit_ubuntu22_2023.0.0.10926.b4452d56304_x86_64/setupvars.sh
cmake -B build -DWHISPER_OPENVINO=1
cmake --build build -j --config Release
./main -m models/ggml-medium.bin -f samples/jfk.wav
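One step in the sequence above that is easy to miss: setupvars.sh must be sourced in the same shell that later runs cmake, or the `-DWHISPER_OPENVINO=1` configure step cannot find the toolkit. A hedged pre-build check (`INTEL_OPENVINO_DIR` is the variable we believe the 2023.x Linux toolkit's setupvars.sh sets; treat the exact name as an assumption):

```shell
# Hypothetical pre-build check: -DWHISPER_OPENVINO=1 only works if the
# OpenVINO environment was sourced in the current shell.
openvino_env_ready() {
  if [ -n "${INTEL_OPENVINO_DIR:-}" ]; then
    echo "ok: OpenVINO env at $INTEL_OPENVINO_DIR"
  else
    echo "missing: source /path/to/openvino_toolkit/setupvars.sh first"
  fi
}

openvino_env_ready
```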

rkilchmn commented 5 months ago

I have the exact same issue on Ubuntu 22.04 under WSL2 on Windows 11, on a laptop with a Gen11 CPU and Gen12 GPU, with OpenVINO installed.

~/whisper.cpp$ ./build/bin/main -m models/ggml-base.en-encoder-openvino.bin -f samples/jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en-encoder-openvino.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_with_params_no_state: failed to load model
error: failed to initialize whisper context

The models had been converted successfully.

rkilchmn commented 5 months ago

I found the solution here: https://github.com/ggerganov/whisper.cpp/pull/1694#issuecomment-1870984510

You only need to provide the path of the standard model to the main. Ensure that both the standard model and the OPENVINO encoder model have matching names and are located in the same directory. For instance, if the standard model is named ABCD.bin, then the corresponding OPENVINO model should be named ABCD-encoder-openvino.bin

The working command is:

./build/bin/main -m models/ggml-base.bin -f samples/jfk.wav

I think this can be closed

lus105 commented 5 months ago

I also observed the same issue. However, if only ggml-base.bin is present (no OpenVINO model), it works the same way. Moreover, the benchmark numbers are almost identical to inference without any accelerator.

zachs-55 commented 2 months ago

> I placed both models in the same directory; however, it now fails with "illegal instruction".

Have a look at CMakeLists.txt; I'd guess you need to set additional flags like -DWHISPER_NO_AVX=ON, -DWHISPER_NO_AVX2=ON, etc. I ran into this today.
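The guess above can be automated. This sketch (helper name ours) reads the CPU's flag list and prints a cmake invocation that disables whatever the machine lacks; `-DWHISPER_NO_AVX=ON` and `-DWHISPER_NO_AVX2=ON` come from the comment above, while the FMA and F16C analogues are assumptions based on the options whisper.cpp exposed in that era:

```shell
# Hypothetical helper: suggest -DWHISPER_NO_* options for CPU features the
# machine lacks (an "illegal instruction" crash usually means the binary was
# built for an instruction set the CPU does not support).
suggest_cmake_flags() {
  cpu_flags=" $1 "
  opts=""
  case "$cpu_flags" in *" avx "*)  ;; *) opts="$opts -DWHISPER_NO_AVX=ON"  ;; esac
  case "$cpu_flags" in *" avx2 "*) ;; *) opts="$opts -DWHISPER_NO_AVX2=ON" ;; esac
  case "$cpu_flags" in *" fma "*)  ;; *) opts="$opts -DWHISPER_NO_FMA=ON"  ;; esac
  case "$cpu_flags" in *" f16c "*) ;; *) opts="$opts -DWHISPER_NO_F16C=ON" ;; esac
  echo "cmake -B build -DWHISPER_OPENVINO=1$opts"
}

# On Linux, feed it the real flag list from /proc/cpuinfo:
suggest_cmake_flags "$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
```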

daniel156161 commented 3 weeks ago

When I run the command, I get "mbind failed: Invalid argument" very often in the whisper output.

Command:

whisper.cpp -l de -m models/ggml-large-v3.bin /tmp/test_file.wav

Whisper Output:

whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-large-v3.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_model_load: CPU total size = 3094.36 MB
whisper_model_load: model size    = 3094.36 MB
whisper_init_state: kv self size  = 251.66 MB
whisper_init_state: kv cross size = 251.66 MB
whisper_init_state: kv pad size   = 7.86 MB
whisper_init_state: compute buffer (conv)   = 36.26 MB
whisper_init_state: compute buffer (encode) = 926.66 MB
whisper_init_state: compute buffer (cross)  = 9.38 MB
whisper_init_state: compute buffer (decode) = 213.19 MB
whisper_ctx_init_openvino_encoder: loading OpenVINO model from 'models/ggml-large-v3-encoder-openvino.xml'
whisper_ctx_init_openvino_encoder: first run on a device may take a while ...
whisper_openvino_init: path_model = models/ggml-large-v3-encoder-openvino.xml, device = CPU, cache_dir = models/ggml-large-v3-encoder-openvino-cache
mbind failed: Invalid argument
whisper_ctx_init_openvino_encoder: OpenVINO model loaded

system_info: n_threads = 4 / 20 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 1

main: processing '/tmp/test_file.wav' (4612180 samples, 288.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = de, task = transcribe, timestamps = 1 ...

"mbind failed: Invalid argument" keeps repeating during the run, and after some time I get the text output of my audio file.
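For what it's worth, mbind is the Linux NUMA memory-binding syscall, and as the report above shows, transcription still completes despite the messages. One plausible (assumed, not confirmed) source of EINVAL is a kernel that exposes no usable NUMA topology, which this sketch (ours, not part of whisper.cpp) inspects:

```shell
# Hypothetical diagnostic: count the NUMA nodes the kernel exposes via sysfs.
# A single node (or none) means memory binding has nothing to distribute
# across, and mbind calls may fail; the warning appears to be harmless here.
count_numa_nodes() {
  ls -d /sys/devices/system/node/node* 2>/dev/null | wc -l
}

echo "NUMA nodes visible: $(count_numa_nodes)"
```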