Open inisis opened 1 week ago
I would love that! 🔥 If you'd like, you can integrate it into the v3 conversion script: https://github.com/xenova/transformers.js/blob/v3/scripts/convert.py. There are many improvements to make there, and a huge refactor is certainly needed. Let me know if that's something you'd be interested in!
@xenova I started by following the README:
python -m scripts.convert --quantize --model_id bert-base-uncased
and I found that after the Optimum export, we can use onnxslim to further optimize the model; here are the slimmed results.
The quantization also works fine, so I wonder whether you have benchmark tests on CI, or how else I could measure the slimmed model's performance.
Hi @xenova,
I found your repo on Hugging Face and I have tested some models manually,
like esm2_t30_150M_UR50D
and Phi-3-mini-4k-instruct-onnx-web;
the results are promising.
Feature request
Thank you for your awesome project! Could onnxslim be added to your pipeline? There is still some room for ONNX optimization after the Optimum export.
Motivation
I believe there is still some performance to be gained by running onnxslim after the export.
Your contribution
I can submit a PR that adds onnxslim.