Open khromov opened 8 months ago
Hi there 👋 Yes, I noticed this recently when converting newer sentence-transformers models with Optimum, and it should be fixed by this PR. In the meantime, you can use the already-converted version of sentence-transformers/paraphrase-multilingual-mpnet-base-v2, which is Xenova/paraphrase-multilingual-mpnet-base-v2 (link).
System Info
Environment/Platform
Description
After trying to use my own quantized model, I'm getting this error:
Reproduction
First I'm converting an existing model to ONNX
Then I'm using the HuggingFace embedding from LlamaIndex (although the pipeline is set up by transformers directly):
The HuggingFaceEmbedding class is just a thin wrapper around transformers.
The above code works fine with existing models on HF.
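To make the "thin wrapper" point concrete, here is a minimal sketch of what such a wrapper amounts to. It is illustrative only and not the actual HuggingFaceEmbedding implementation: `ThinEmbeddingWrapper`, `fake_extractor`, and the method names are hypothetical, and the extractor is a toy stand-in for a real feature-extraction pipeline. The idea is that the wrapper only forwards calls, so a model-loading error would most likely come from the underlying library rather than from the wrapper.

```python
from typing import Callable, List, Sequence

# Hypothetical type for a feature-extraction callable:
# takes a batch of texts and returns one vector per text.
Extractor = Callable[[Sequence[str]], List[List[float]]]


class ThinEmbeddingWrapper:
    """Illustrative wrapper (assumption, not the real class).

    It adds no model logic of its own and simply delegates to the
    extractor it was given.
    """

    def __init__(self, extractor: Extractor) -> None:
        self._extractor = extractor

    def get_text_embedding(self, text: str) -> List[float]:
        # Wrap the single text in a batch of one, then unwrap the result.
        return self._extractor([text])[0]

    def get_text_embeddings(self, texts: Sequence[str]) -> List[List[float]]:
        # Pass the batch through unchanged.
        return self._extractor(list(texts))


def fake_extractor(texts: Sequence[str]) -> List[List[float]]:
    # Toy stand-in for a real pipeline: [character count, word count].
    return [[float(len(t)), float(len(t.split()))] for t in texts]


if __name__ == "__main__":
    wrapper = ThinEmbeddingWrapper(fake_extractor)
    print(wrapper.get_text_embedding("hello world"))
```

With a wrapper shaped like this, swapping in a different model changes only what the extractor does, which is consistent with the same calling code working for some models and failing for others.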