ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Cannot build CoreML models #898

Open servin opened 1 year ago

servin commented 1 year ago

Core ML model conversion never finishes on my M1 Pro. I finally got an error, and I couldn't find any relevant information about it in the repo:

xcrun: error: unable to find utility "coremlc"

tried reinstalling the Xcode tools and verified that this dependency is available:

```
(venv10) servin@192 whisper.cpp % /Applications/Xcode.app/Contents/Developer/usr/bin/coremlc
coremlc: error: usage: coremlc [options ...]
```
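One way to check whether the build script can actually resolve `coremlc` is to ask `xcrun` for it, the same way the generate script does. A minimal diagnostic sketch (stdlib only; `xcrun --find` is the standard macOS tool-lookup command, and this degrades gracefully on other systems):

```python
# Sketch: check whether the Core ML compiler is reachable via xcrun
# (macOS-only; on other systems this simply reports "not found").
import shutil
import subprocess

def find_coremlc():
    # xcrun resolves tools inside the active Xcode developer directory,
    # so this fails if xcode-select points at the wrong location.
    if shutil.which("xcrun") is None:
        return None  # not macOS, or command line tools missing
    result = subprocess.run(
        ["xcrun", "--find", "coremlc"], capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None

print(find_coremlc() or "coremlc not found via xcrun")
```

If this prints "not found" while the binary exists under `/Applications/Xcode.app/Contents/Developer/usr/bin/`, the active developer directory is likely mis-set (see the `xcode-select --switch` fix later in this thread).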

servin commented 1 year ago

I was able to build the base model, so my theory is that the device isn't meeting the minimum requirements, even though the memory usage isn't that bad.

```
Converting PyTorch Frontend ==> MIL Ops: 100%|███▉| 2611/2612 [00:00<00:00, 3147.78 ops/s]
Running MIL frontend_pytorch pipeline: 100%|████| 5/5 [00:00<00:00, 31.62 passes/s]
Running MIL default pipeline: 100%|████| 57/57 [00:16<00:00, 3.40 passes/s]
Running MIL backend_mlprogram pipeline: 100%|████| 10/10 [00:00<00:00, 336.03 passes/s]
```

*(screenshot, 2023-05-09 9:07:50)*

servin commented 1 year ago

found a clue

```
coremltools 6.3.0 requires protobuf<=4.0.0,>=3.1.0, but you have protobuf 4.22.3 which is incompatible.
```
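For anyone who wants to confirm whether this conflict applies in their own environment, here is a minimal stdlib-only sketch that compares the installed `protobuf` version against the constraint pip reports for coremltools 6.3 (`>=3.1.0,<=4.0.0`):

```python
# Sketch: check the installed protobuf version against the constraint
# pip reports for coremltools 6.3.0. Returns None if protobuf is not
# installed (or has an unparseable version string).
from importlib.metadata import PackageNotFoundError, version

def protobuf_satisfies_constraint():
    try:
        parts = tuple(int(x) for x in version("protobuf").split(".")[:3])
    except (PackageNotFoundError, ValueError):
        return None
    return (3, 1, 0) <= parts <= (4, 0, 0)

print(protobuf_satisfies_constraint())
```

With protobuf 4.22.3 installed, as in the message above, this returns False; downgrading (e.g. `pip install "protobuf<=4.0.0,>=3.1.0"`) should bring it back in range.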

stsydow commented 1 year ago

I found that coremltools==6.3.0 does not support Python 3.11 yet.

And I did the pip install in one go to help dependency resolution:

```
pip3.10 install openai-whisper coremltools ane-transformers --force-reinstall
```

and for conversion, change line 16 of the models/generate-coreml-model.sh script to use Python 3.10:

```
python3.10 models/convert-whisper-to-coreml.py --model $mname --encoder-only True
```

CodyBontecou commented 1 year ago

Hmm, I believe I'm running into the same issue. I've attempted @stsydow's solution with no luck.

I'm running this through a conda environment with Python 3.10.11.

*(screenshot, 2023-05-18 9:03:20 PM)*

EDIT: Never mind, I found this thread with the solution. It required me to run `sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer`

ajitsinghkaler commented 1 year ago

I'm getting the same error, and coremlc is on my PATH. I'm using Python 3.10. I'm able to generate the base.en model but not the other models, not even tiny or tiny.en. I already tried the above solutions; they did not work.

```
ajitsingh@192 bin % ./models/generate-coreml-model.sh base
zsh: no such file or directory: ./models/generate-coreml-model.sh
ajitsingh@192 bin % cd ~/Documents/whisper.cpp
ajitsingh@192 whisper.cpp % ./models/generate-coreml-model.sh base
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1455, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1007)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ajitsingh/Documents/whisper.cpp/models/convert-whisper-to-coreml.py", line 308, in <module>
    whisper = load_model(args.model).cpu()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/__init__.py", line 131, in load_model
    checkpoint_file = _download(_MODELS[name], download_root, in_memory)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/__init__.py", line 67, in _download
    with urllib.request.urlopen(url) as source, open(download_target, "wb") as output:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1007)>
coremlc: error: Model does not exist at models/coreml-encoder-base.mlpackage -- file:///Users/ajitsingh/Documents/whisper.cpp/
mv: rename models/coreml-encoder-base.mlmodelc to models/ggml-base-encoder.mlmodelc: No such file or directory
```

```
ajitsingh@192 whisper.cpp % ./models/generate-coreml-model.sh base.en
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
ModelDimensions(n_mels=80, n_audio_ctx=1500, n_audio_state=512, n_audio_head=8, n_audio_layer=6, n_vocab=51864, n_text_ctx=448, n_text_state=512, n_text_head=8, n_text_layer=6)
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/whisper/model.py:97: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  scale = (n_state // self.n_head) ** -0.25
Converting PyTorch Frontend ==> MIL Ops: 100%|███████▊| 531/532 [00:00<00:00, 5696.09 ops/s]
Running MIL frontend_pytorch pipeline: 100%|████████| 5/5 [00:00<00:00, 586.57 passes/s]
Running MIL default pipeline: 100%|████████| 57/57 [00:00<00:00, 83.20 passes/s]
Running MIL backend_mlprogram pipeline: 100%|████████| 10/10 [00:00<00:00, 1933.57 passes/s]
done converting
/Users/ajitsingh/Documents/whisper.cpp/models/coreml-encoder-base.en.mlmodelc/coremldata.bin
models/coreml-encoder-base.en.mlmodelc -> models/ggml-base.en-encoder.mlmodelc
```
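The `CERTIFICATE_VERIFY_FAILED` in the failing run above is a separate problem from `coremlc`: the model download over HTTPS is rejected before conversion even starts, which is why `coremlc` later complains that the `.mlpackage` does not exist. A common cause with python.org installs of Python on macOS (like the `/Library/Frameworks/Python.framework` install in the log) is that the bundled `Install Certificates.command` was never run. A stdlib-only sketch to inspect which CA bundle your Python is actually using:

```python
# Sketch: inspect where this Python interpreter looks for CA certificates.
# On a python.org macOS install, running
# "/Applications/Python 3.10/Install Certificates.command" once is the
# usual fix; a corporate proxy injecting a self-signed cert is another
# possible cause of the error above.
import ssl

paths = ssl.get_default_verify_paths()
print("cafile: ", paths.cafile)          # explicit CA bundle file, if any
print("capath: ", paths.capath)          # CA directory, if any
print("openssl:", paths.openssl_cafile)  # compiled-in OpenSSL default
```

If `cafile` is None and the `openssl_cafile` path does not exist on disk, certificate verification will fail for every HTTPS download, not just this one.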

ajitsinghkaler commented 1 year ago

Solved it by using a virtualenv; I guess some global dependency was interfering with the install.
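For anyone hitting the same mix of global site-packages: the conversion scripts can be run from a throwaway environment. A minimal sketch using the standard-library `venv` module, equivalent to `python3.10 -m venv venv10` followed by activating and reinstalling the dependencies (the directory name here is arbitrary):

```python
# Sketch: create an isolated environment programmatically with the
# stdlib venv module. with_pip=False keeps the sketch fast; in practice
# you would use with_pip=True and then install openai-whisper,
# coremltools, and ane-transformers inside it.
import os
import tempfile
import venv

env_dir = tempfile.mkdtemp(prefix="whisper-coreml-")
venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))  # True
```

Running the generate script with the environment's interpreter (`<env>/bin/python models/convert-whisper-to-coreml.py ...`) ensures no globally installed protobuf or coremltools leaks in.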

mliradelc commented 1 year ago

> Hmm, I believe I'm running into the same issue. I've attempted @stsydow's solution with no luck.
>
> I'm running this through a conda environment with Python 3.10.11.
>
> EDIT: Never mind, I found this thread with the solution. It required me to run `sudo xcode-select --switch /Applications/Xcode.app/Contents/Developer`

Thanks. I should add that before running that line, you need to have Xcode from the App Store installed.