Onnx quantized backend for Clip-ViT-B-16

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

https://www.sbert.net

Apache License 2.0

Onnx quantized backend for Clip-ViT-B-16 #3006

Open PraNavKumAr01 opened 1 month ago

PraNavKumAr01 commented 1 month ago

@tomaarsen Just wanted to know whether CLIP (text + image) embedding models will get an ONNX quantized model. I tried finding one everywhere but had no luck. If one exists, can you please point me to it? If not, is it possible to create a model_qint8_avx512_vnni.onnx for it? Can we expect it in a future update, or would I have to run some experiments and convert it on my own?

tomaarsen commented 1 month ago

Hello!

I'm afraid that CLIP models don't have ONNX support in Sentence Transformers right now. In short, CLIP models are loaded with the CLIPModel module, whereas the ONNX support is implemented in the Transformer module. This latter module is used for text-based embedding models.

Tom Aarsen
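
For contrast, here is a minimal sketch of how a text-only model can be loaded through the ONNX backend of the Transformer module. It assumes a Sentence Transformers release that accepts the `backend` argument, with the ONNX extras installed. The model name, the `onnx/` subfolder in the file path, and the helper names `onnx_load_kwargs` and `load_text_model` are illustrative assumptions, not something stated in this thread. Per the reply above, the same arguments are not expected to work for CLIP checkpoints.

```python
from typing import Any, Dict

# File name taken from the question above. Whether a given model repository
# ships this file, and whether it lives in an "onnx/" subfolder, is an
# assumption to verify per model.
QUANTIZED_FILE = "onnx/model_qint8_avx512_vnni.onnx"


def onnx_load_kwargs(file_name: str = QUANTIZED_FILE) -> Dict[str, Any]:
    """Build keyword arguments for loading a text model with the ONNX backend."""
    return {"backend": "onnx", "model_kwargs": {"file_name": file_name}}


def load_text_model(model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
    """Load a text embedding model via ONNX (hypothetical helper).

    The import is done lazily so the sketch can be imported without the
    library installed; calling this function needs the library and may
    download the model.
    """
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name, **onnx_load_kwargs())
```

Calling `load_text_model().encode(["a photo of a cat"])` would then return text embeddings from the quantized ONNX file, provided that file exists for the chosen model.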