Open ArtemBernatskyy opened 1 year ago
Whisper.cpp now has CoreML support:
https://github.com/ggerganov/whisper.cpp/pull/566
Using it directly with whisper.cpp should be as simple as compiling with the appropriate flags:
cd build
cmake -DWHISPER_COREML=1 ..
Check by running:
./main -m models/ggml-base.en.bin -f samples/gb0.wav
...
whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
system_info: n_threads = 4 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | COREML = 1 |
...
Note: COREML = 1.
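Since the system_info line is the easiest place to confirm that a build actually picked up Core ML, here is a minimal sketch of checking it programmatically. The parsing helper below is hypothetical (not part of whisper.cpp or whispercpp.py); it just splits a system_info line like the one above into a dict of feature flags:

```python
# Hypothetical helper: parse whisper.cpp's "system_info" log line into a
# dict of feature flags, so a script can assert COREML = 1 after a build.
def parse_system_info(line: str) -> dict:
    flags = {}
    for part in line.split("|"):
        part = part.strip()
        # Skip the "n_threads = 4 / 10" field, which is not a 0/1 flag.
        if "=" in part and "/" not in part:
            name, _, value = part.partition("=")
            flags[name.strip()] = int(value.strip())
    return flags

info = parse_system_info(
    "system_info: n_threads = 4 / 10 | AVX = 0 | NEON = 1 | BLAS = 1 | COREML = 1 |"
)
print(info["COREML"])  # 1 means the Core ML path was compiled in
```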
For whispercpp.py, we can add the flag for CoreML here inside setup.py:
if sys.platform == 'darwin':
    os.environ['CFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11'
    os.environ['CXXFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11'
    os.environ['LDFLAGS'] = '-framework Accelerate'
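One pitfall with the snippet above is that assigning os.environ outright overwrites any flags already set in the environment. A sketch that appends instead; note the extra frameworks on the LDFLAGS line are an assumption on my part (whisper.cpp's own Core ML build links Foundation and CoreML in addition to Accelerate, but verify against the current sources):

```python
import os
import sys

def add_flags(var: str, extra: str) -> None:
    """Append extra flags to an env var, preserving anything already set."""
    current = os.environ.get(var, "")
    os.environ[var] = (current + " " + extra).strip()

if sys.platform == "darwin":
    add_flags("CFLAGS", "-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11")
    add_flags("CXXFLAGS", "-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11")
    # Accelerate is what the original snippet links; Foundation/CoreML are an
    # assumption based on what whisper.cpp's own Core ML build links.
    add_flags("LDFLAGS", "-framework Accelerate -framework Foundation -framework CoreML")
```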
First, update the whisper.cpp submodule inside whispercpp.py. Check that it still runs; it might need some changes if the API has changed. If it still works, add the flag inside setup.py.
I can't test this at the moment, but feel free to make the pull request, and we can get this feature added.
Thanks! I've decided to use OpenAI's Whisper API for now; in my tests it beats local Whisper with CoreML by 3-4x (on a MacBook M1 with 32 GB).
@stlukey
I have verified that my computer is an M2. I found that CoreML does not seem to be enabled by this command.
I also added this compile flag, which does not seem to work either:
if sys.platform == 'darwin':
    print("run here.....")
    os.environ['CFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=gnu11'
    os.environ['CXXFLAGS'] = '-DWHISPER_COREML=1 -DGGML_USE_ACCELERATE -O3 -std=c++11'
    os.environ['LDFLAGS'] = '-framework Accelerate'
That is, I added -DWHISPER_COREML=1 and used the latest whisper.cpp code. When I run the generated whisper.xxxx.so file to transcribe the same audio, it takes 12 minutes. But if I compile whisper.cpp at the same commit with cmake -DWHISPER_COREML=1 and run ./main on the same audio, it only takes 7 minutes. Also, I can see that the loading process shows:
whisper_init_state: loading Core ML model from 'models/ggml-large-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
system_info: n_threads = 4 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | COREML = 1 | OPENVINO = 0 |
It loads the Core ML model, and it only takes 8 minutes. I expected the .so call to be able to do this as well. How can I modify it?
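For a like-for-like comparison between the .so binding and ./main, it helps to time both with the same harness. A small sketch; the binary, model, and audio paths in the commented line are placeholders, not real paths from this thread:

```python
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Placeholder paths -- substitute your own binary, model, and audio file:
# elapsed = time_command(["./main", "-m", "models/ggml-large.bin", "-f", "audio.wav"])
elapsed = time_command([sys.executable, "-c", "pass"])  # harness self-check
print(f"{elapsed:.2f}s")
```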
How can we add CoreML support? Thanks!