ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Not compiling on m1 mac #204

Closed (knpwrs closed 1 year ago)

knpwrs commented 1 year ago

Whenever I run make I see the following output:

❯ make
Makefile:21: Your arch is announced as x86_64, but it seems to actually be ARM64. Not fixing that can lead to bad performance. For more info see: https://github.com/ggerganov/whisper.cpp/issues/66\#issuecomment-1282546789
gcc-11  -I.              -O3 -std=c11   -pthread -DGGML_USE_ACCELERATE   -c ggml.c -o ggml.o
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Headers/../Frameworks/vecLib.framework/Headers/vecLib.h:25,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Headers/Accelerate.h:20,
                 from ggml.c:96:
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vU64Sub':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:658:3: note: use '-flax-vector-conversions' to permit conversions between vectors with differing element types or numbers of subparts
  658 |   vUInt32   __vbasicops_vB) { return vsubq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |   ^~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:658:38: error: incompatible types when returning type 'uint64x2_t' but 'vUInt32' was expected
  658 |   vUInt32   __vbasicops_vB) { return vsubq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vS64Sub':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:733:38: error: incompatible types when returning type 'int64x2_t' but 'vUInt32' was expected
  733 |   vUInt32   __vbasicops_vB) { return vsubq_s64( (int64x2_t)__vbasicops_vA, (int64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vU64Add':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:805:38: error: incompatible types when returning type 'uint64x2_t' but 'vUInt32' was expected
  805 |   vUInt32   __vbasicops_vB) { return vaddq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vS64Add':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:875:38: error: incompatible types when returning type 'int64x2_t' but 'vUInt32' was expected
  875 |   vSInt32   __vbasicops_vB) { return vaddq_s64( (int64x2_t)__vbasicops_vA, (int64x2_t)__vbasicops_vB); }
      |                                      ^
make: *** [Makefile:125: ggml.o] Error 1
zsh: exit 2     make

I saw the link to https://github.com/ggerganov/whisper.cpp/issues/66#issuecomment-1282546789, but nothing there helped. On a hunch from that thread, I thought the issue might be caused by using Alacritty as my terminal emulator with tmux as a multiplexer. So I opened Terminal.app and... it worked!

❯ make base.en
Makefile:21: Your arch is announced as x86_64, but it seems to actually be ARM64. Not fixing that can lead to bad performance. For more info see: https://github.com/ggerganov/whisper.cpp/issues/66\#issuecomment-1282546789
g++ -I. -I./examples -O3 -std=c++11 -pthread examples/main/main.cpp ggml.o whisper.o -o main -framework Accelerate
./main -h

usage: ./main [options] file0.wav file1.wav ...

options:
  -h,       --help          [default] show this help message and exit
  -t N,     --threads N     [4      ] number of threads to use during computation
  -p N,     --processors N  [1      ] number of processors to use during computation
  -ot N,    --offset-t N    [0      ] time offset in milliseconds
  -on N,    --offset-n N    [0      ] segment index offset
  -d  N,    --duration N    [0      ] duration of audio to process in milliseconds
  -mc N,    --max-context N [-1     ] maximum number of text context tokens to store
  -ml N,    --max-len N     [0      ] maximum segment length in characters
  -wt N,    --word-thold N  [0.01   ] word timestamp probability threshold
  -su,      --speed-up      [false  ] speed up audio by x2 (reduced accuracy)
  -tr,      --translate     [false  ] translate from source language to english
  -di,      --diarize       [false  ] stereo audio diarization
  -otxt,    --output-txt    [false  ] output result in a text file
  -ovtt,    --output-vtt    [false  ] output result in a vtt file
  -osrt,    --output-srt    [false  ] output result in a srt file
  -owts,    --output-words  [false  ] output script for generating karaoke video
  -ps,      --print-special [false  ] print special tokens
  -pc,      --print-colors  [false  ] print colors
  -nt,      --no-timestamps [true   ] do not print timestamps
  -l LANG,  --language LANG [en     ] spoken language
  -m FNAME, --model FNAME   [models/ggml-base.en.bin] model path
  -f FNAME, --file FNAME    [       ] input WAV file path

bash ./models/download-ggml-model.sh base.en
Downloading ggml model base.en from 'https://huggingface.co/datasets/ggerganov/whisper.cpp' ...
Model base.en already exists. Skipping download.

===============================================
Running base.en on all samples in ./samples ...
===============================================

----------------------------------------------
[+] Running base.en on samples/jfk.wav ... (run 'ffplay samples/jfk.wav' to listen)
----------------------------------------------

whisper_model_load: loading model from 'models/ggml-base.en.bin'
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 2
whisper_model_load: adding 1607 extra tokens
whisper_model_load: mem_required  =  506.00 MB
whisper_model_load: ggml ctx size =  140.60 MB
whisper_model_load: memory size   =   22.83 MB
whisper_model_load: model size    =  140.54 MB

system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | NEON = 1 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | 

main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...

[00:00:00.000 --> 00:00:11.000]   And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.

whisper_print_timings:     load time =   173.35 ms
whisper_print_timings:      mel time =    24.87 ms
whisper_print_timings:   sample time =     3.84 ms
whisper_print_timings:   encode time =   334.11 ms / 55.69 ms per layer
whisper_print_timings:   decode time =    86.87 ms / 14.48 ms per layer
whisper_print_timings:    total time =   623.39 ms

When I went to check Activity Monitor.app, however, it appeared that Alacritty, tmux, and zsh were all Apple processes, not Intel (i.e., not running under Rosetta):

[Three Activity Monitor screenshots]

Confused, I decided to try make clean and make again in my Alacritty/tmux/zsh setup, and saw the same error as I did originally.

Here's the real kicker though: I went back to Terminal.app, plain zsh, no tmux, and... now I'm seeing the same build error as I did originally under Alacritty, for both make and make base.en:

❯ make base.en
Makefile:21: Your arch is announced as x86_64, but it seems to actually be ARM64. Not fixing that can lead to bad performance. For more info see: https://github.com/ggerganov/whisper.cpp/issues/66\#issuecomment-1282546789
gcc-11  -I.              -O3 -std=c11   -pthread -DGGML_USE_ACCELERATE   -c ggml.c -o ggml.o
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Headers/../Frameworks/vecLib.framework/Headers/vecLib.h:25,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Headers/Accelerate.h:20,
                 from ggml.c:96:
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vU64Sub':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:658:3: note: use '-flax-vector-conversions' to permit conversions between vectors with differing element types or numbers of subparts
  658 |   vUInt32   __vbasicops_vB) { return vsubq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |   ^~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:658:38: error: incompatible types when returning type 'uint64x2_t' but 'vUInt32' was expected
  658 |   vUInt32   __vbasicops_vB) { return vsubq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vS64Sub':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:733:38: error: incompatible types when returning type 'int64x2_t' but 'vUInt32' was expected
  733 |   vUInt32   __vbasicops_vB) { return vsubq_s64( (int64x2_t)__vbasicops_vA, (int64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vU64Add':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:805:38: error: incompatible types when returning type 'uint64x2_t' but 'vUInt32' was expected
  805 |   vUInt32   __vbasicops_vB) { return vaddq_u64( (uint64x2_t)__vbasicops_vA, (uint64x2_t)__vbasicops_vB); }
      |                                      ^
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h: In function 'vS64Add':
/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Headers/vBasicOps.h:875:38: error: incompatible types when returning type 'int64x2_t' but 'vUInt32' was expected
  875 |   vSInt32   __vbasicops_vB) { return vaddq_s64( (int64x2_t)__vbasicops_vA, (int64x2_t)__vbasicops_vB); }
      |                                      ^
make: *** [Makefile:125: ggml.o] Error 1
zsh: exit 2     make base.en

I'm not super familiar with C++, but it seems like the compiler doesn't like something inside Accelerate.framework. Are there any debugging steps I can take to figure this out?

ggerganov commented 1 year ago

You can always build the main example manually by running these commands:

cc  -I.              -O3 -std=c11   -pthread -DGGML_USE_ACCELERATE   -c ggml.c -o ggml.o
c++ -I. -I./examples -O3 -std=c++11 -pthread -c whisper.cpp -o whisper.o
c++ -I. -I./examples -O3 -std=c++11 -pthread examples/main/main.cpp ggml.o whisper.o -o main  -framework Accelerate

This should work regardless of the terminal app that you use. Other than that, I am not sure what the problem is. It looks like a misconfiguration in your environment.
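
One way to look for such a misconfiguration is to compare what each terminal session reports about its architecture and toolchain. The following is a minimal sketch, assuming a POSIX shell. The helper names `describe_arch` and `tool_version` are hypothetical, and `gcc-11` is listed only because it appears in the failing log above. `sysctl.proc_translated` is a macOS key that reads 1 when the calling process runs under Rosetta 2.

```shell
#!/bin/sh
# Diagnostic sketch: report the architecture and toolchain the current shell sees.

# Print where a tool resolves on PATH and the first line of its version banner.
tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s -> %s: %s\n' "$1" "$(command -v "$1")" \
            "$("$1" --version 2>&1 | head -n 1)"
    else
        printf '%s -> not found on PATH\n' "$1"
    fi
}

# Interpret `uname -m` ($1) together with sysctl.proc_translated ($2).
# $2 is 1 under Rosetta 2, 0 when native, and "unknown" where the key is missing.
describe_arch() {
    case "$1:$2" in
        x86_64:1)          echo "x86_64 under Rosetta 2 (hardware is ARM64)" ;;
        arm64:*|aarch64:*) echo "native ARM64" ;;
        *)                 echo "$1 (no Rosetta translation detected)" ;;
    esac
}

machine=$(uname -m)
translated=$(sysctl -n sysctl.proc_translated 2>/dev/null || echo unknown)

echo "shell arch : $(describe_arch "$machine" "$translated")"
echo "CC         : ${CC:-<unset>}"
echo "CXX        : ${CXX:-<unset>}"
for tool in make cc c++ gcc-11; do
    tool_version "$tool"
done
```

Running this in both Alacritty/tmux and Terminal.app and comparing the output may show whether the sessions differ in architecture, in the exported CC, or in which make comes first on PATH. The failing log invokes gcc-11 while the manual commands use cc, so if CC turns out to be exported as gcc-11, `make CC=cc CXX=c++` may be worth trying.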

knpwrs commented 1 year ago

Those commands work. And the transcription is brilliantly fast even on CPU (the original Python implementation was painfully slow).

Awesome work on this!

While it would be great to have make working, I'm going to end up building this in Docker and running it in Linux containers, so I'll go ahead and close this since it's not blocking me.

Thank you!

ggerganov commented 1 year ago

Just FYI: whisper.cpp is 2-3 times faster than the Python implementation only on Apple Silicon (ARM) CPUs. On x86 architectures it is only marginally faster, so don't expect the same performance on Linux.

knpwrs commented 1 year ago

Good to know. I'll have to try it out. What about ARM in the cloud, by chance? E.g., AWS Graviton.

ggerganov commented 1 year ago

Here are benchmarks for Graviton 3: https://github.com/ggerganov/whisper.cpp/issues/89#issuecomment-1322319512