Open ronyfadel opened 9 months ago
Hm, I can't reproduce on my Mac Studio:
./bin/main -m ../models/ggml-large-v2.bin -f ../samples/gb0.wav --no-gpu
whisper_init_from_file_with_params_no_state: loading model from '../models/ggml-large-v2.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1280
whisper_model_load: n_text_head = 20
whisper_model_load: n_text_layer = 32
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 5 (large)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 3093.99 MB
whisper_model_load: model size = 3093.99 MB
whisper_init_state: kv self size = 220.20 MB
whisper_init_state: kv cross size = 245.76 MB
whisper_init_state: compute buffer (conv) = 34.82 MB
whisper_init_state: compute buffer (encode) = 934.34 MB
whisper_init_state: compute buffer (cross) = 9.38 MB
whisper_init_state: compute buffer (decode) = 209.26 MB
system_info: n_threads = 4 / 24 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |
main: processing '../samples/gb0.wav' (2037686 samples, 127.4 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
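As a sanity check on the `main: processing` line above, the reported duration follows from the sample count once the audio is at 16 kHz, the rate whisper.cpp works with. A minimal sketch; `duration_seconds` is a hypothetical helper name, not part of whisper.cpp:

```python
# whisper.cpp operates on 16 kHz audio, so duration = samples / 16000.
WHISPER_SAMPLE_RATE = 16_000  # Hz


def duration_seconds(n_samples: int, sample_rate: int = WHISPER_SAMPLE_RATE) -> float:
    """Return the audio length in seconds for a given sample count."""
    return n_samples / sample_rate


# Sample count taken from the log line above.
print(f"{duration_seconds(2037686):.1f} sec")  # prints "127.4 sec"
```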
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: loading '/Users/liujianhui/mobile2024/fwhisper/example/build/macos/Build/Products/Debug/fwhisper_example.app/Contents/Frameworks/fwhisper.framework/Resources/default.metallib'
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction support = true
ggml_metal_init: simdgroup matrix mul. support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: skipping kernel_mul_mm_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_xs_f32 (not supported)
whisper_backend_init: Metal GPU does not support family 7 - falling back to CPU
ggml_metal_free: deallocating
whisper_model_load: CPU total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: loading '/Users/liujianhui/mobile2024/fwhisper/example/build/macos/Build/Products/Debug/fwhisper_example.app/Contents/Frameworks/fwhisper.framework/Resources/default.metallib'
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction support = true
ggml_metal_init: simdgroup matrix mul. support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: skipping kernel_mul_mm_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_K_f32 (not supported)
whisper_backend_init: Metal GPU does not support family 7 - falling back to CPU
ggml_metal_free: deallocating
whisper_init_state: kv self size = 16.52 MB
whisper_init_state: kv cross size = 18.43 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 14.01 MiB
whisper_init_state: compute buffer (conv) = 16.39 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 127.26 MiB
whisper_init_state: compute buffer (encode) = 135.14 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 2.93 MiB
whisper_init_state: compute buffer (cross) = 4.78 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 90.38 MiB
whisper_init_state: compute buffer (decode) = 96.48 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_needs_realloc: node node_27 is not valid
ggml_gallocr_alloc_graph: reallocating buffers automatically
Although the output above is displayed, the run still completes successfully.
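To confirm from a log like the ones in this thread that inference ended up on the CPU, it may help to count Metal initialization attempts against fallback messages. The sketch below is a hypothetical helper (`summarize_backend` is not part of whisper.cpp) and assumes only the two log strings that appear above:

```python
# Hypothetical helper, not part of whisper.cpp: it only looks for the two
# log messages quoted in this thread.
METAL_MARKER = "whisper_backend_init: using Metal backend"
FALLBACK_MARKER = "falling back to CPU"


def summarize_backend(log_text: str) -> dict:
    """Count Metal init attempts and CPU fallbacks in a whisper.cpp log."""
    attempts = log_text.count(METAL_MARKER)
    fallbacks = log_text.count(FALLBACK_MARKER)
    return {
        "metal_attempts": attempts,
        "cpu_fallbacks": fallbacks,
        # Metal is only in effect if at least one attempt was not abandoned.
        "effective_backend": "metal" if attempts > fallbacks else "cpu",
    }


sample = """\
whisper_backend_init: using Metal backend
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics
whisper_backend_init: Metal GPU does not support family 7 - falling back to CPU
"""
print(summarize_backend(sample))
```

On the Intel Iris logs in this thread, every Metal attempt is followed by a fallback, so the helper would report `cpu`.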
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/liujianhui/www2024/whisper.cpp/ggml-metal.metal'
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction support = true
ggml_metal_init: simdgroup matrix mul. support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: skipping kernel_mul_mm_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_xs_f32 (not supported)
whisper_backend_init: Metal GPU does not support family 7 - falling back to CPU
ggml_metal_free: deallocating
whisper_model_load: CPU total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/liujianhui/www2024/whisper.cpp/ggml-metal.metal'
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction support = true
ggml_metal_init: simdgroup matrix mul. support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: skipping kernel_mul_mm_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_iq4_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f32_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_f16_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_1_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q8_0_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q2_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q3_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q4_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q5_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_q6_K_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_xs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_xxs_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq3_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq2_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq1_s_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_nl_f32 (not supported)
ggml_metal_init: skipping kernel_mul_mm_id_iq4_xs_f32 (not supported)
whisper_backend_init: Metal GPU does not support family 7 - falling back to CPU
ggml_metal_free: deallocating
whisper_init_state: kv self size = 16.52 MB
whisper_init_state: kv cross size = 18.43 MB
whisper_init_state: compute buffer (conv) = 16.39 MB
whisper_init_state: compute buffer (encode) = 135.14 MB
whisper_init_state: compute buffer (cross) = 4.78 MB
whisper_init_state: compute buffer (decode) = 96.48 MB
file fname_inp: samples/queen.wav
system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 1 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0
main: processing 'samples/queen.wav' (4308421 samples, 269.3 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
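The two `system_info:` lines in this thread (the Mac Studio and the Intel Mac) differ in which CPU features are enabled. A small sketch for comparing them; `parse_system_info` and `diff_flags` are hypothetical names, and the parsing assumes the `NAME = value |` layout seen in the logs:

```python
def parse_system_info(line: str) -> dict:
    """Parse a whisper.cpp `system_info:` log line into {flag_name: int}."""
    _, _, body = line.partition("system_info:")
    flags = {}
    for field in body.split("|"):
        name, sep, value = field.partition("=")
        if not sep:
            continue  # empty field after a trailing "|"
        name, value = name.strip(), value.strip()
        if name == "n_threads":
            continue  # "4 / 8" is a thread count, not a 0/1 feature flag
        flags[name] = int(value)
    return flags


def diff_flags(a: dict, b: dict) -> dict:
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    return {k: (a.get(k), b.get(k)) for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


# Both lines are copied from the logs in this thread.
MAC_STUDIO = (
    "system_info: n_threads = 4 / 24 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | "
    "NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | "
    "BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |"
)
INTEL_MAC = (
    "system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | "
    "NEON = 0 | ARM_FMA = 0 | METAL = 1 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | "
    "BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0"
)

for flag, (arm, intel) in diff_flags(parse_system_info(MAC_STUDIO), parse_system_info(INTEL_MAC)).items():
    print(f"{flag}: Mac Studio={arm}, Intel Mac={intel}")
```

Both machines report `METAL = 1`, which reflects how the binary was built; the Intel logs above still show the Metal backend being abandoned at runtime.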
'OTHER_CFLAGS' => ['$(inherited)', '-Wno-shorten-64-to-32', '-O3', '-flto', '-std=c11', '-fPIC', '-fno-objc-arc'],
'OTHER_CPLUSPLUSFLAGS' => ['$(inherited)', '-Wno-shorten-64-to-32', '-O3', '-flto', '-fPIC', '-std=c++11', '-fno-objc-arc'],
'GCC_PREPROCESSOR_DEFINITIONS' => ['$(inherited)', 'GGML_USE_METAL=1'],
The problem has been solved.
s.pod_target_xcconfig = {
  'DEFINES_MODULE' => 'YES',
  'OTHER_CFLAGS' => ['$(inherited)', '-O3', '-flto', '-DNDEBUG', '-std=c11', '-fPIC', '-D_XOPEN_SOURCE=600', '-D_DARWIN_C_SOURCE', '-pthread', '-mavx', '-mavx2', '-mfma', '-mf16c', '-msse3', '-mssse3', '-fno-objc-arc'],
  'OTHER_CPLUSPLUSFLAGS' => ['$(inherited)', '-O3', '-flto', '-DNDEBUG', '-fPIC', '-D_XOPEN_SOURCE=600', '-D_DARWIN_C_SOURCE', '-pthread', '-mavx', '-mavx2', '-mfma', '-mf16c', '-msse3', '-mssse3', '-std=c++11', '-fno-objc-arc'],
  'GCC_PREPROCESSOR_DEFINITIONS' => ['$(inherited)', 'GGML_USE_METAL=1', 'DACCELERATE_NEW_LAPACK', 'DACCELERATE_LAPACK_ILP64'],
}
How did you resolve it? In my issue, I also found:
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
@jianhuihi this is a different issue; please stop commenting here.
It doesn't crash if I run ./bin/main -m ../models/ggml-large-v2.bin -f ../samples/gb0.wav --no-gpu
but it does in Xcode. Hmmm 🤔.
Is this expected?
I'm on master, commit hash:
59119f4f20b27
Machine: Apple M1 Pro, 14-inch, Sonoma 14.3.1
Logs:
screenshot:
trace: trace.txt