100% crash when calling whisper_init_from_file_with_params with Address Sanitizer enabled (Xcode, ggml-large-v2) on both use_gpu={0,1}

ronyfadel commented 9 months ago

Is this expected?

I'm on master, commit hash: 59119f4f20b27

Machine: Apple M1 Pro, 14-inch, Sonoma 14.3.1

Logs:

whisper_init_from_file_with_params_no_state: loading model from 'ggml-large-v2.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  3093.99 MB

whisper_model_load: model size    = 3093.99 MB
whisper_init_state: kv self size  =  220.20 MB
whisper_init_state: kv cross size =  245.76 MB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: reallocating buffers automatically
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 31.59 MiB
whisper_init_state: compute buffer (conv)   =   34.82 MB

screenshot:

trace: trace.txt

ggerganov commented 9 months ago

Hm, I can't reproduce on my Mac Studio:

./bin/main -m ../models/ggml-large-v2.bin -f ../samples/gb0.wav --no-gpu
whisper_init_from_file_with_params_no_state: loading model from '../models/ggml-large-v2.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  3093.99 MB
whisper_model_load: model size    = 3093.99 MB
whisper_init_state: kv self size  =  220.20 MB
whisper_init_state: kv cross size =  245.76 MB
whisper_init_state: compute buffer (conv)   =   34.82 MB
whisper_init_state: compute buffer (encode) =  934.34 MB
whisper_init_state: compute buffer (cross)  =    9.38 MB
whisper_init_state: compute buffer (decode) =  209.26 MB

system_info: n_threads = 4 / 24 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | 

main: processing '../samples/gb0.wav' (2037686 samples, 127.4 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...