huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.24k stars, 223 forks

`optimum-intel` notebook #480

Closed: danielkorat closed this 9 months ago

danielkorat commented 10 months ago

Hi @tomaarsen 👋

This PR adds a notebook that demonstrates how to accelerate SetFit models using `optimum-intel`, achieving a 3.3x latency speedup (batch size 1) and a 3x-4x throughput increase without any accuracy drop. Specifically, it applies static 8-bit quantization to the SetFit model body using INC (Intel Neural Compressor).
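
For readers unfamiliar with the term, here is a rough, library-free sketch of what "static" 8-bit quantization means: the scale and zero point are fixed ahead of time from a calibration set, then reused for every later input. This is only an illustration of the arithmetic, not how INC or `optimum-intel` implement it, and the function names (`calibrate`, `quantize`, `dequantize`) are hypothetical.

```python
# Illustrative sketch only: affine int8 quantization with a statically
# calibrated range. NOT the INC / optimum-intel implementation; all
# function names here are made up for the example.

def calibrate(samples):
    """Derive a fixed scale and zero point from calibration values."""
    lo = min(min(samples), 0.0)  # keep 0.0 inside the range so it maps exactly
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if the range is empty
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to int8 using the fixed parameters, clipping to [-128, 127]."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in q_values]


if __name__ == "__main__":
    # "Static": parameters come from calibration data, not from the input itself.
    scale, zero_point = calibrate([-1.0, 0.25, 0.5, 3.0])
    q = quantize([-1.0, 0.0, 3.0, 10.0], scale, zero_point)
    print(q)  # values outside the calibrated range (10.0) are clipped
    print(dequantize(q, scale, zero_point))
```

The clipping of out-of-range inputs is why a representative calibration set matters for this approach.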

HuggingFaceDocBuilderDev commented 10 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

tomaarsen commented 9 months ago

We can ignore the failing test - it also occurs on main, and I'm unable to reproduce it for now.

tomaarsen commented 9 months ago

Nice changes!

Some extra comments: