ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License
35.57k stars 3.62k forks

Segmentation fault while running talk on mac m1 max #974

Open ZechenM opened 1 year ago

ZechenM commented 1 year ago

Hi, I am getting a segfault while running talk. Whether I run it with "-p Santa" enabled (as the talk README suggests) or with no flags at all (i.e. just "./talk"), it segfaults every time. This was mentioned in #782, but it seems the issue still remains. Thanks in advance for any tips!

(py310-whisper) whisper.cpp % ./talk -p Santa
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2
whisper_model_load: mem required  = 310.00 MB (+ 6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx    = 140.66 MB
whisper_model_load: model size   = 140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB
whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
gpt2_model_load: loading model from 'models/ggml-gpt-2-117M.bin'
gpt2_model_load: failed to open 'models/ggml-gpt-2-117M.bin'
gpt2_init: failed to load model from 'models/ggml-gpt-2-117M.bin'

main: processing, 4 threads, lang = en, task = transcribe, timestamps = 0 ...

init: found 4 capture devices:
init:    - Capture device #0: 'Zechen’s AirPods Pro #2'
init:    - Capture device #1: 'Z’s iPhone Microphone'
init:    - Capture device #2: 'MacBook Pro Microphone'
init:    - Capture device #3: 'ZoomAudioDevice'
init: attempt to open default capture device ...
init: obtained spec for input device (SDL Id = 2):
init:     - sample rate:       16000
init:     - format:            33056 (required: 33056)
init:     - channels:          1 (required: 1)
init:     - samples per frame: 1024
zsh: segmentation fault  ./talk -p Santa

Originally posted by @ZechenM in https://github.com/ggerganov/whisper.cpp/issues/782#issuecomment-1569312708

Update on 5/31, 1:43 AM PST:

I realized I actually hadn't loaded the gpt-2 model. In fact, the wget method mentioned in the talk README (https://github.com/ggerganov/whisper.cpp/tree/master/examples/talk) unfortunately doesn't work. Instead, I followed the gpt-2 installation instructions (https://github.com/ggerganov/ggml/tree/master/examples/gpt-2#downloading-and-converting-the-original-models) to download the ggml model directly.

However, I now get a new and different segfault:

(base) zechenma@Zechens-MacBook-Pro whisper.cpp % ./talk -p Santa
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2
whisper_model_load: mem required  = 310.00 MB (+ 6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx    = 140.66 MB
whisper_model_load: model size   = 140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB
gpt2_model_load: loading model from 'models/ggml-model.bin'
gpt2_model_load: n_vocab = 50257
gpt2_model_load: n_ctx   = 1024
gpt2_model_load: n_embd  = 768
gpt2_model_load: n_head  = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: ftype   = 1
gpt2_model_load: ggml ctx size = 384.74 MB
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 403426048, available 403425792)
zsh: segmentation fault  ./talk -p Santa

BluBb-mADe commented 10 months ago

I ran into this problem as well, but on Windows. The actual number of extra bytes I needed to solve it in my case was "needed - available + GGML_OBJECT_SIZE". I just added a line after gpt-2.cpp#L186 like this: ctx_size += 24384 + GGML_OBJECT_SIZE; which fixed the problem. I suspect it is related to compiler-dependent padding of some structs, but I couldn't be bothered to figure out where the difference actually comes from. Be warned, though: even with this fix I wasn't able to get this example running yet. Some fairly recent changes seem to have broken something strangely obvious deep inside ggml.c when running GPT-2 on it, so you might be better off trying your luck with a different example.