xenova / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0

Awesome project #797

Open · inisis opened this issue 1 week ago

inisis commented 1 week ago

Feature request

Thank you for your awesome project! Could onnxslim be added to your pipeline? There is still some room for ONNX optimization after the Optimum export.

Motivation

I believe there is still a performance gain to be had by running onnxslim.

Your contribution

I can submit a PR that adds onnxslim.

xenova commented 1 week ago

I would love that! 🔥 If you'd like, you can integrate it into the v3 conversion script: https://github.com/xenova/transformers.js/blob/v3/scripts/convert.py. There are many improvements to make there, and a huge refactor is certainly needed. Let me know if that's something you'd be interested in!
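For reference, a minimal sketch of what such a hook could look like inside the conversion script, assuming onnxslim's `slim()` accepts a model path and returns an `onnx.ModelProto` (the function name `slim_exported_models` and the `output_dir` argument are illustrative, not part of the actual script):

```python
from pathlib import Path

import onnx

try:
    # onnxslim is treated as an optional dependency in this sketch
    from onnxslim import slim
except ImportError:
    slim = None


def slim_exported_models(output_dir: str) -> None:
    """Run onnxslim over every ONNX file produced by the Optimum export."""
    if slim is None:
        print("onnxslim is not installed; skipping graph slimming")
        return
    for onnx_path in Path(output_dir).glob("*.onnx"):
        # assumed API: slim() takes a path and returns the optimized ModelProto
        slimmed = slim(str(onnx_path))
        onnx.save(slimmed, str(onnx_path))
        print(f"slimmed {onnx_path.name}")
```

The idea would be to run this between the Optimum export and the quantization step, so both the full-precision and quantized outputs benefit from the slimmed graph.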

inisis commented 5 days ago

@xenova I started by following the README:

python -m scripts.convert --quantize --model_id bert-base-uncased

and I found that after the Optimum export, onnxslim can optimize the model further. Here is the slimming summary: [screenshot: onnxslim before/after result]

The quantization also works fine, so I wonder whether you have a benchmark test in CI, or how I can measure the performance of the slimmed model.
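As a rough local check (not an official harness), one could time the original and slimmed graphs directly with onnxruntime. A sketch, assuming BERT-style input names from the Optimum export and placeholder file names:

```python
import time

import numpy as np
import onnxruntime as ort


def bench(path: str, runs: int = 50) -> float:
    """Return mean latency in ms for a BERT-style model with dummy inputs."""
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    batch, seq = 1, 128
    feeds = {
        "input_ids": np.random.randint(0, 30522, (batch, seq), dtype=np.int64),
        "attention_mask": np.ones((batch, seq), dtype=np.int64),
        "token_type_ids": np.zeros((batch, seq), dtype=np.int64),
    }
    # warm-up runs so session initialization does not skew the timing
    for _ in range(5):
        sess.run(None, feeds)
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, feeds)
    return (time.perf_counter() - start) / runs * 1000


for name in ["model.onnx", "model_slim.onnx"]:  # placeholder paths
    print(f"{name}: {bench(name):.2f} ms")
```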

inisis commented 2 days ago

Hi @xenova

I found your repo on Hugging Face and have tested some models manually, for example esm2_t30_150M_UR50D:

[screenshot: onnxslim result for esm2_t30_150M_UR50D]

and Phi-3-mini-4k-instruct-onnx-web:

[screenshot: onnxslim result for Phi-3-mini-4k-instruct-onnx-web]

The results are promising.
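For completeness, a quick way to sanity-check that slimming is lossless is to compare outputs of the original and slimmed graphs on the same random inputs. A sketch only; the file names are placeholders and the feed construction assumes plain tensor inputs:

```python
import numpy as np
import onnxruntime as ort

orig = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
slimmed = ort.InferenceSession("model_slim.onnx", providers=["CPUExecutionProvider"])

# build random feeds from the original model's declared inputs
feeds = {}
for inp in orig.get_inputs():
    # resolve dynamic/symbolic dimensions to 1 for a quick smoke test
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    if inp.type == "tensor(int64)":
        feeds[inp.name] = np.random.randint(0, 100, shape, dtype=np.int64)
    else:
        feeds[inp.name] = np.random.rand(*shape).astype(np.float32)

for a, b in zip(orig.run(None, feeds), slimmed.run(None, feeds)):
    # outputs should agree up to small numerical tolerance
    print("max abs diff:", np.abs(a - b).max())
```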