aoezdTchibo opened this issue 1 month ago
Hello!
The former seems to be an issue with Optimum. I've reported it here: https://github.com/huggingface/optimum/issues/2062. In short, the `token_type_ids` are not returned by the tokenizer, as they're optional in `transformers`, but for BERT models they're mandatory in `optimum`.
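Until that is resolved on the `optimum` side, a possible user-side workaround (just a sketch, assuming all-zero segment IDs are acceptable, as they are for single-sequence BERT inputs) is to make sure `token_type_ids` are present before feeding the ONNX model:

```python
import numpy as np
from transformers import AutoTokenizer

# Standalone illustration only, not the eventual fix in optimum/sentence-transformers.
tokenizer = AutoTokenizer.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)
encoded = tokenizer(
    ["This is an example sentence"],
    return_tensors="np",
    return_token_type_ids=True,  # explicitly ask the tokenizer for token_type_ids
)
# Some tokenizers still omit them; fall back to all zeros so the ONNX model's
# mandatory token_type_ids input is satisfied.
if "token_type_ids" not in encoded:
    encoded["token_type_ids"] = np.zeros_like(encoded["input_ids"])
```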
The second one I can't reproduce, but it seems that you're using the CoreMLExecutionProvider (by default), an execution provider that I'm not familiar with. Could you perhaps try it with:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    backend="onnx",
    model_kwargs={"provider": "CPUExecutionProvider"},
)
sentences = ["This is an example sentence", "Each sentence is converted"]
embeddings = model.encode(sentences)
```
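If that doesn't help, it may also be worth checking which execution providers your local onnxruntime build actually exposes; a quick check (plain onnxruntime, nothing Sentence Transformers specific) would be:

```python
import onnxruntime as ort

# Lists the execution providers compiled into this onnxruntime build.
# CPUExecutionProvider should always be present; CoreMLExecutionProvider
# only appears on macOS builds that include it.
print(ort.get_available_providers())
```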
I added testing of the top 10 multilingual SBERT models from the MTEB leaderboard; several issues came up:
```python
from sentence_transformers import SentenceTransformer

# top 10 MTEB leaderboard multilingual SBERT embedding models
models = [
    'BAAI/bge-multilingual-gemma2',
    'intfloat/multilingual-e5-large-instruct',
    'HIT-TMG/KaLM-embedding-multilingual-mini-v1',
    'gte-multilingual-base',
    'Alibaba-NLP/gte-multilingual-base',
    'intfloat/multilingual-e5-base',
    'intfloat/multilingual-e5-small',
]

for model_name in models:
    try:
        model = SentenceTransformer(
            model_name,
            backend="onnx",
            model_kwargs={
                "provider": "CPUExecutionProvider",
                # not supported with onnx
                # "torch_dtype": torch.float16
            },
            trust_remote_code=True,
            cache_folder='/mnt/datasets/sbert',
        )
        '''
        SentenceTransformer(
          (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ORTModelForFeatureExtraction
          (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
          (2): Normalize()
        )
        '''
        print(model)
        sentences = ["This is an example sentence", "Each sentence is converted"]
        embeddings = model.encode(sentences)
        print(embeddings.shape())
    except Exception as e:
        print(f'error loading {model_name} {str(e)}')
```
Stacktrace:

```
BAAI/bge-multilingual-gemma2
No 'model.onnx' found in 'BAAI/bge-multilingual-gemma2'. Exporting the model to ONNX.
Loading checkpoint shards: 100%
error loading BAAI/bge-multilingual-gemma2 Trying to export a gemma2 model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as `custom_onnx_configs`. Please refer to https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type gemma2 to be supported natively in the ONNX export.

intfloat/multilingual-e5-large-instruct
error loading intfloat/multilingual-e5-large-instruct 'tuple' object is not callable

No 'model.onnx' found in 'HIT-TMG/KaLM-embedding-multilingual-mini-v1'. Exporting the model to ONNX.
tokenization_qwen.py: 100%
A new version of the following files was downloaded from https://huggingface.co/HIT-TMG/KaLM-embedding-multilingual-mini-v1:
- tokenization_qwen.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
/opt/conda/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py:103: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if sequence_length != 1:
Saving the exported ONNX model is heavily recommended to avoid having to export it again. Do so with model.push_to_hub('HIT-TMG/KaLM-embedding-multilingual-mini-v1', create_pr=True).
tokenization_qwen.py: 100%
A new version of the following files was downloaded from https://huggingface.co/HIT-TMG/KaLM-embedding-multilingual-mini-v1:
- tokenization_qwen.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
HIT-TMG/KaLM-embedding-multilingual-mini-v1
error loading HIT-TMG/KaLM-embedding-multilingual-mini-v1 'position_ids'

No sentence-transformers model found with name sentence-transformers/gte-multilingual-base. Creating a new one with mean pooling.
error loading gte-multilingual-base sentence-transformers/gte-multilingual-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- configuration.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
No 'model.onnx' found in 'Alibaba-NLP/gte-multilingual-base'. Exporting the model to ONNX.
Some weights of the model checkpoint at Alibaba-NLP/gte-multilingual-base were not used when initializing NewModel: ['classifier.bias', 'classifier.weight']
This IS expected if you are initializing NewModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing NewModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
tokenizer_config.json: 100%
A new version of the following files was downloaded from https://huggingface.co/Alibaba-NLP/new-impl:
- configuration.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
error loading Alibaba-NLP/gte-multilingual-base Trying to export a new model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as custom_onnx_configs. Please refer to https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models for an example on how to export custom models. Please open an issue at https://github.com/huggingface/optimum/issues if you would like the model type new to be supported natively in the ONNX export.

intfloat/multilingual-e5-base
error loading intfloat/multilingual-e5-base 'tuple' object is not callable

intfloat/multilingual-e5-small
error loading intfloat/multilingual-e5-small 'NoneType' object has no attribute 'numpy'
```
With the latest `optimum` and the upcoming Sentence Transformers v3.3.0, these should work again.
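For anyone who wants to verify after upgrading, a minimal re-check (a sketch that just re-runs one of the previously failing models on the CPU provider; it assumes the latest `optimum` and Sentence Transformers >= 3.3.0 are installed) would be:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "intfloat/multilingual-e5-small",
    backend="onnx",
    model_kwargs={"provider": "CPUExecutionProvider"},
)
embeddings = model.encode(["This is an example sentence", "Each sentence is converted"])
print(embeddings.shape)  # note: .shape is a property, not a method
```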
Some of the others listed in https://github.com/UKPLab/sentence-transformers/issues/2983#issuecomment-2423936925 are "expected" failures, I believe, as they're e.g. custom or novel architectures not integrated with `optimum`. I can't do too much about those.
With the new release of version 3.2.0, the use of ONNX has become much easier, but initial local tests led to various errors, meaning that it was not possible to use ONNX Runtime via Sentence Transformers. See these two examples:

intfloat/multilingual-e5-small led to the following error:

sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 led to the following error:

Local environment: python=3.10, sentence-transformers=3.2.0, onnx=1.17.0, onnxruntime=1.19.2, optimum=1.23.0
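For reference, the failing calls boil down to loading these models with the ONNX backend; a minimal sketch of the reproduction (assuming the default export path, i.e. no pre-exported model.onnx on the Hub) is:

```python
from sentence_transformers import SentenceTransformer

for model_name in (
    "intfloat/multilingual-e5-small",
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
):
    # backend="onnx" loads (or exports) the ONNX variant of the model via optimum
    model = SentenceTransformer(model_name, backend="onnx")
    embeddings = model.encode(["This is an example sentence"])
    print(model_name, embeddings.shape)
```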