huggingface / exporters

Export Hugging Face models to Core ML and TensorFlow Lite
Apache License 2.0

Error converting PyTorch bert-small-uncased for text classification #27

Open dgilim opened 1 year ago

dgilim commented 1 year ago

Hello. I am trying to convert a fine-tuned PyTorch version of the bert-small-uncased model to Core ML, but I get the following error:

python -m exporters.coreml --model=./small_legal_bert --feature text-classification exported/

Using framework PyTorch: 2.0.0
Overriding 1 configuration item(s)
        - use_cache -> False
Skipping token_type_ids input
Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:   0%|                                                                                            | 0/345 [00:00<?, ? ops/s]Core ML embedding (gather) layer does not support any inputs besides the weights and indices. Those given will be ignored.
Converting PyTorch Frontend ==> MIL Ops:  99%|███████████████████████████████████████████████████████████████████████████████▌| 343/345 [00:00<00:00, 4742.81 ops/s]
Running MIL frontend_pytorch pipeline: 100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 948.04 passes/s]
Running MIL default pipeline:   0%|                                                                                                     | 0/56 [00:00<?, ? passes/s]/Users/dgilim/anaconda3/lib/python3.10/site-packages/coremltools/converters/mil/mil/passes/defs/preprocess.py:262: UserWarning: Output, '555', of the source model, has been renamed to 'var_555' in the Core ML model.
  warnings.warn(msg.format(var.name, new_name))
Running MIL default pipeline: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 56/56 [00:00<00:00, 159.49 passes/s]
Running MIL backend_mlprogram pipeline: 100%|████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 1016.90 passes/s]
/Users/dgilim/anaconda3/lib/python3.10/site-packages/coremltools/models/model.py:146: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: Error compiling model: "Failed to parse the model specification. Error: Unable to parse ML Program: in operation of type classify: Classifier probabilities must have a fully known shape.".
  _warnings.warn(
Validating Core ML model...
Traceback (most recent call last):
  File "/Users/dgilim/anaconda3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/dgilim/anaconda3/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/dgilim/Projects/exporters/src/exporters/coreml/__main__.py", line 175, in <module>
    main()
  File "/Users/dgilim/Projects/exporters/src/exporters/coreml/__main__.py", line 163, in main
    convert_model(
  File "/Users/dgilim/Projects/exporters/src/exporters/coreml/__main__.py", line 67, in convert_model
    validate_model_outputs(coreml_config, preprocessor, model, mlmodel, args.atol)
  File "/Users/dgilim/Projects/exporters/src/exporters/coreml/validate.py", line 108, in validate_model_outputs
    coreml_outputs = mlmodel.predict(coreml_inputs)
  File "/Users/dgilim/anaconda3/lib/python3.10/site-packages/coremltools/models/model.py", line 554, in predict
    raise self._framework_error
  File "/Users/dgilim/anaconda3/lib/python3.10/site-packages/coremltools/models/model.py", line 144, in _get_proxy_and_spec
    return _MLModelProxy(filename, compute_units.name), specification, None
RuntimeError: Error compiling model: "Failed to parse the model specification. Error: Unable to parse ML Program: in operation of type classify: Classifier probabilities must have a fully known shape.".

Also attaching config.json from the model:

{
  "_name_or_path": "nlpaueb/legal-bert-small-uncased",
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_ids": 0,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 512,
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_labels": 2,
  "num_attention_heads": 8,
  "num_hidden_layers": 6,
  "output_past": true,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",
  "transformers_version": "4.28.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
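
For reference, a BertForSequenceClassification head produces logits of shape (batch_size, num_labels), so with this config the classifier probabilities would be expected to have 2 entries per example. The following is a minimal sketch of that arithmetic only; `expected_logits_shape` is a hypothetical helper written for illustration and is not part of exporters, transformers, or coremltools.

```python
import json

# Subset of the config.json above: only the fields relevant to the classifier head.
CONFIG_JSON = """
{
  "architectures": ["BertForSequenceClassification"],
  "hidden_size": 512,
  "num_labels": 2,
  "problem_type": "single_label_classification"
}
"""


def expected_logits_shape(config: dict, batch_size: int = 1) -> tuple:
    """Return the (batch_size, num_labels) shape of the classification logits.

    Hypothetical helper: it only reads `num_labels` from the config dict.
    """
    return (batch_size, int(config["num_labels"]))


config = json.loads(CONFIG_JSON)
shape = expected_logits_shape(config)
print(shape)
```

If the converted model reports anything other than a fixed two-element probability output for a batch of one, that mismatch is a reasonable place to start looking.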

hollance commented 1 year ago

@pcuenca Could this be related to the removal of the code that sets the output shape? The error says:

Classifier probabilities must have a fully known shape

This might have been the reason why I originally added that code to fill in the shapes (even though Apple says it's the wrong way to do it).
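
To make the wording of the error concrete, here is a rough sketch of what "fully known shape" can be taken to mean: every dimension is a fixed positive integer. This is plain Python and does not call coremltools. The helper name `is_fully_known` is hypothetical, and treating `None`, negative values, symbolic names, and an empty shape as "unknown" is an assumption made for illustration, not a description of how Core ML checks shapes internally.

```python
from typing import Iterable, Optional, Union

# A dimension may be a fixed size, a symbolic name, or missing entirely.
Dim = Union[int, str, None]


def is_fully_known(shape: Optional[Iterable[Dim]]) -> bool:
    """Return True if the shape is non-empty and every dim is a fixed positive int.

    Assumption: an empty shape is treated as "not filled in" rather than a scalar.
    """
    if shape is None:
        return False
    dims = list(shape)
    if not dims:
        return False
    return all(
        isinstance(d, int) and not isinstance(d, bool) and d > 0
        for d in dims
    )


print(is_fully_known((1, 2)))          # fixed batch of 1, two classes
print(is_fully_known((1, "seq_len")))  # symbolic dimension
print(is_fully_known(()))              # shape never filled in
```

Under these assumptions, the classifier output for this model would pass the check only once its shape is pinned to something like (1, 2), which is what the removed shape-filling code was presumably doing.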

pcuenca commented 1 year ago

I just tested commit 0588d7cab707a8952c4433137ec70a49d864fe2b and the same error occurs. Trying to debug.