ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

CoreML: MLModel of type mlProgram cannot be loaded just from the model spec object. #1495

Open loretoparisi opened 10 months ago

loretoparisi commented 10 months ago

I get this CoreML error when running conversion with quantization:

python3 models/convert-whisper-to-coreml.py --model tiny.en --encoder-only True --quantize True

Stacktrace:

loretoparisi@Loretos-MBP whisper.cpp % python3 models/convert-whisper-to-coreml.py --model tiny.en --encoder-only True --quantize True
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
100%|█████████████████████████████████████| 72.1M/72.1M [00:12<00:00, 5.91MiB/s]
ModelDimensions(n_mels=80, n_audio_ctx=1500, n_audio_state=384, n_audio_head=6, n_audio_layer=4, n_vocab=51864, n_text_ctx=448, n_text_state=384, n_text_head=6, n_text_layer=4)
/opt/homebrew/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"
When both 'convert_to' and 'minimum_deployment_target' not specified, 'convert_to' is set to "mlprogram" and 'minimum_deployment_targer' is set to ct.target.iOS15 (which is same as ct.target.macOS12). Note: the model will not run on systems older than iOS15/macOS12/watchOS8/tvOS15. In order to make your model run on older system, please set the 'minimum_deployment_target' to iOS14/iOS13. Details please see the link: https://coremltools.readme.io/docs/unified-conversion-api#target-conversion-formats
Converting PyTorch Frontend ==> MIL Ops: 100%|████████████████████████████████████████████████████████████████████████████████▊| 369/370 [00:00<00:00, 11331.47 ops/s]
Running MIL frontend_pytorch pipeline: 100%|█████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 1017.10 passes/s]
Running MIL default pipeline: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 71/71 [00:00<00:00, 194.74 passes/s]
Running MIL backend_mlprogram pipeline: 100%|██████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2283.55 passes/s]
Quantizing using linear quantization
Traceback (most recent call last):
  File "/Users/loretoparisi/Documents/Projects/whisper.cpp/models/convert-whisper-to-coreml.py", line 323, in <module>
    encoder = convert_encoder(hparams, encoder, quantize=args.quantize)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/loretoparisi/Documents/Projects/whisper.cpp/models/convert-whisper-to-coreml.py", line 268, in convert_encoder
    model = quantize_weights(model, nbits=16)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/neural_network/quantization_utils.py", line 1683, in quantize_weights
    quantized_model = _get_model(qspec, compute_units=full_precision_model.compute_unit)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/utils.py", line 345, in _get_model
    return MLModel(spec, compute_units=compute_units)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/model.py", line 388, in __init__
    raise Exception(
Exception: MLModel of type mlProgram cannot be loaded just from the model spec object. It also needs the path to the weights file. Please provide that as well, using the 'weights_dir' argument.

whereas if I run it without quantization, it works fine:

loretoparisi@Loretos-MBP whisper.cpp % python3 models/convert-whisper-to-coreml.py --model tiny.en --encoder-only True                
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
ModelDimensions(n_mels=80, n_audio_ctx=1500, n_audio_state=384, n_audio_head=6, n_audio_layer=4, n_vocab=51864, n_text_ctx=448, n_text_state=384, n_text_head=6, n_text_layer=4)
/opt/homebrew/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"
Converting PyTorch Frontend ==> MIL Ops: 100%|█████████████████████████████████████████████████████████████████████████████████▊| 369/370 [00:00<00:00, 9451.07 ops/s]
Running MIL frontend_pytorch pipeline: 100%|██████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 981.54 passes/s]
Running MIL default pipeline: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 71/71 [00:00<00:00, 180.48 passes/s]
Running MIL backend_mlprogram pipeline: 100%|██████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 2425.74 passes/s]
done converting
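
The traceback points at the legacy quantization_utils.quantize_weights API, which predates the mlprogram format and cannot rebuild an mlProgram from the spec object alone. A minimal sketch of an alternative, assuming coremltools 7+ where the optimize.coreml API is available (paths and settings below are illustrative, not taken from the conversion script):

```python
import coremltools as ct
import coremltools.optimize.coreml as cto

# Load the converted mlprogram package (illustrative path).
mlmodel = ct.models.MLModel("models/coreml-encoder-tiny.en.mlpackage")

# 8-bit linear weight quantization for mlprogram models; the older
# quantization_utils helpers only handle the "neuralnetwork" format.
op_config = cto.OpLinearQuantizerConfig(mode="linear_symmetric")
config = cto.OptimizationConfig(global_config=op_config)
compressed = cto.linear_quantize_weights(mlmodel, config=config)

compressed.save("models/coreml-encoder-tiny.en-q8.mlpackage")
```

For the nbits=16 case specifically, an mlprogram converted with the default compute_precision is already stored in float16, so the extra quantization step may not be needed at all.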
loretoparisi commented 10 months ago

[UPDATE] The issue happens even if I pass --quantize False

% python3 models/convert-whisper-to-coreml.py --model large-v3 --encoder-only True --quantize False
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
ModelDimensions(n_mels=128, n_audio_ctx=1500, n_audio_state=1280, n_audio_head=20, n_audio_layer=32, n_vocab=51866, n_text_ctx=448, n_text_state=1280, n_text_head=20, n_text_layer=32)
/opt/homebrew/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"
When both 'convert_to' and 'minimum_deployment_target' not specified, 'convert_to' is set to "mlprogram" and 'minimum_deployment_targer' is set to ct.target.iOS15 (which is same as ct.target.macOS12). Note: the model will not run on systems older than iOS15/macOS12/watchOS8/tvOS15. In order to make your model run on older system, please set the 'minimum_deployment_target' to iOS14/iOS13. Details please see the link: https://coremltools.readme.io/docs/unified-conversion-api#target-conversion-formats
Converting PyTorch Frontend ==> MIL Ops: 100%|███████████████████████████████████████████████████████████████████████████████▉| 2613/2614 [00:00<00:00, 3314.04 ops/s]
Running MIL frontend_pytorch pipeline: 100%|███████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 29.82 passes/s]
Running MIL default pipeline: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 71/71 [00:19<00:00,  3.68 passes/s]
Running MIL backend_mlprogram pipeline: 100%|███████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 280.18 passes/s]
Quantizing using linear quantization
Traceback (most recent call last):
  File "/Users/loretoparisi/Documents/Projects/whisper.cpp/models/convert-whisper-to-coreml.py", line 323, in <module>
    encoder = convert_encoder(hparams, encoder, quantize=args.quantize)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/loretoparisi/Documents/Projects/whisper.cpp/models/convert-whisper-to-coreml.py", line 268, in convert_encoder
    model = quantize_weights(model, nbits=16)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/neural_network/quantization_utils.py", line 1683, in quantize_weights
    quantized_model = _get_model(qspec, compute_units=full_precision_model.compute_unit)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/utils.py", line 345, in _get_model
    return MLModel(spec, compute_units=compute_units)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/coremltools/models/model.py", line 388, in __init__
    raise Exception(
Exception: MLModel of type mlProgram cannot be loaded just from the model spec object. It also needs the path to the weights file. Please provide that as well, using the 'weights_dir' argument.
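
Note that the log still prints "Quantizing using linear quantization" even with --quantize False, which is consistent with a common argparse pitfall: if the flag were declared with type=bool (an assumption about the script, not confirmed here), the string "False" would be truthy. A minimal sketch of that behaviour:

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical declaration; with type=bool, argparse calls bool("False"),
# and any non-empty string evaluates to True.
parser.add_argument("--quantize", type=bool, default=False)

args = parser.parse_args(["--quantize", "False"])
print(args.quantize)  # True -- quantization still runs
```

A common fix for this pattern is action="store_true" (or an explicit string-to-bool converter), so that quantization is off unless the flag is actually given.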
loretoparisi commented 10 months ago

[UPDATE]

Even when the conversion completes successfully

python3 models/convert-whisper-to-coreml.py --model large-v3 --encoder-only True                 
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
ModelDimensions(n_mels=128, n_audio_ctx=1500, n_audio_state=1280, n_audio_head=20, n_audio_layer=32, n_vocab=51866, n_text_ctx=448, n_text_state=1280, n_text_head=20, n_text_layer=32)
/opt/homebrew/lib/python3.11/site-packages/whisper/model.py:166: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert x.shape[1:] == self.positional_embedding.shape, "incorrect audio shape"
Converting PyTorch Frontend ==> MIL Ops: 100%|███████████████████████████████████████████████████████████████████████████████▉| 2613/2614 [00:00<00:00, 3371.26 ops/s]
Running MIL frontend_pytorch pipeline: 100%|███████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 30.15 passes/s]
Running MIL default pipeline: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 71/71 [00:18<00:00,  3.82 passes/s]
Running MIL backend_mlprogram pipeline: 100%|███████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 299.13 passes/s]
done converting

it seems that the output files are empty:

 % ls -l models/coreml-encoder-large-v3.mlpackage
total 8
drwxr-xr-x  3 loretoparisi  staff   96 Nov 16 11:45 Data
-rw-r--r--  1 loretoparisi  staff  617 Nov 16 11:45 Manifest.json
 % ls -l models/coreml-encoder-tiny.en.mlpackage 
total 8
drwxr-xr-x  3 loretoparisi  staff   96 Nov 16 00:43 Data
-rw-r--r--  1 loretoparisi  staff  617 Nov 16 00:43 Manifest.json
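
For reference, the top level of an .mlpackage normally contains only a Data/ directory and a Manifest.json; the spec and weight blobs live below Data/, so a non-recursive ls can look misleadingly small. A quick sanity check, sketched here with one of the paths from the listing above, is to load the package back and inspect it:

```python
import coremltools as ct

# Loading forces Core ML to read both the spec and the weight files
# referenced by the mlprogram.
model = ct.models.MLModel("models/coreml-encoder-large-v3.mlpackage")
spec = model.get_spec()
print(spec.WhichOneof("Type"))  # expected: "mlProgram"
```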
loretoparisi commented 10 months ago

@ggerganov any idea?

ggerganov commented 10 months ago

Not sure why it fails. I have only a very basic understanding of the CoreML stuff, so hopefully somebody with more expertise can help out.

seungjun-green commented 8 months ago

Still having the same issue today.

mgrachten commented 5 months ago

I ran into this error today using a Core ML export script for my own model. The script used to run fine on an older version of coremltools. The issue in my case was that the old version produced a "neuralnetwork" model type (without setting it explicitly), while the new version defaults to "mlprogram". Setting convert_to="neuralnetwork" when calling convert solved the issue for me. HTH
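
A minimal sketch of that workaround, with a dummy module standing in for the traced Whisper encoder (names and shapes here are illustrative, not taken from the conversion script):

```python
import coremltools as ct
import torch
import torch.nn as nn

# Dummy stand-in for the traced Whisper encoder (hypothetical).
class TinyEncoder(nn.Module):
    def forward(self, x):
        return x * 2.0

example_input = torch.rand(1, 80, 3000)  # (batch, n_mels, frames), assumed shape
traced = torch.jit.trace(TinyEncoder().eval(), example_input)

model = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example_input.shape)],
    convert_to="neuralnetwork",  # legacy format instead of the default "mlprogram"
)
```

With the model in the legacy "neuralnetwork" format, quantization_utils.quantize_weights no longer needs a separate weights directory, which matches the behaviour described in the original report.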