huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0

Patch fusion linear for bert and vit #786

Closed. jiqing-feng closed this 20 hours ago

jiqing-feng commented 1 week ago

Hi @echarlaix. I added two more supported patched models: BERT and ViT. I only use the linear fusion module to optimize these two models; it brings at least a 10% speed-up on SPR (Sapphire Rapids) with little additional code.

Could you please review these changes? Thx!

HuggingFaceDocBuilderDev commented 3 days ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

jiqing-feng commented 2 days ago

Hi @echarlaix, I will prepare some patched models and upload them for testing.

Could you take a look at the failed tests? They pass when run individually but fail when run together.

That's mainly because of leftover file system state. Do you have any suggestions for it? I searched online and found that @pytest.fixture may work, but I have no idea how to use it. If you think that approach is acceptable, please give me some instructions and I will fix it. If you have a better idea, please let me know. Thx!
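For reference, one way to keep file system state from leaking between tests is to run each test inside its own temporary directory. The following is only a minimal sketch using the standard library; the helper name `isolated_workdir` and the file name in the commented example are hypothetical and are not part of this PR or of optimum-intel.

```python
import contextlib
import os
import tempfile
from pathlib import Path


@contextlib.contextmanager
def isolated_workdir():
    """Run the enclosed block inside a fresh temporary directory."""
    previous = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield Path(tmp)
        finally:
            # Always restore the original directory, even if the test fails,
            # so the temporary directory can be removed afterwards.
            os.chdir(previous)


# With pytest installed, the same helper can back a fixture:
#
#     @pytest.fixture
#     def workdir():
#         with isolated_workdir() as path:
#             yield path
#
#     def test_save_model(workdir):
#         (workdir / "model.pt").write_text("stub")
```

pytest's built-in `tmp_path` fixture together with `monkeypatch.chdir` gives similar isolation without a custom helper, which may be simpler if the tests already use pytest fixtures.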

jiqing-feng commented 2 days ago

Hi @echarlaix. I have changed the IPEX version check so that it does not affect the non-patching path and to avoid duplicated code. I also uploaded two models for testing patched models: patched_tiny_random_bert_for_question_answering and patched_tiny_random_vit_for_image_classification. Could you please review? Thx!

BTW, if you need more patched models for all tasks, please let me know. Thx!

jiqing-feng commented 1 day ago

Hi @echarlaix. For the failed tests, the real issue comes from enable_tpp. It changes some environment variables and triggers the traced model check here, so the tests fail when run together.

I have fixed them by changing how we check whether the traced model has been patched. Please review, thx!
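To illustrate the kind of interference described above, here is a rough sketch of two general techniques: undoing environment-variable changes after a call, and deciding "patched or not" from the model object itself instead of from process-wide state. Everything here is a stand-in: `EXAMPLE_TPP_FLAG`, `enable_feature_stub`, `TracedModelStub`, and the `patched` attribute are hypothetical names for illustration, not the actual enable_tpp behavior or the check used in this PR.

```python
import os
from unittest import mock

FLAG = "EXAMPLE_TPP_FLAG"  # hypothetical variable name, for illustration only


def enable_feature_stub():
    """Stand-in for a call that mutates process-wide environment variables."""
    os.environ[FLAG] = "1"


class TracedModelStub:
    """Stand-in for a traced model; `patched` is a hypothetical marker."""

    def __init__(self, patched=False):
        self.patched = patched


def is_patched(model):
    # Decide from the model object itself, not from global process state,
    # so an earlier test that changed the environment cannot flip the result.
    return getattr(model, "patched", False)


def run_isolated(fn):
    """Call `fn` and undo any environment changes it made."""
    # patch.dict snapshots os.environ on entry and restores it on exit.
    with mock.patch.dict(os.environ):
        return fn()
```

With this shape, a test that calls `run_isolated(enable_feature_stub)` leaves `os.environ` as it found it, and `is_patched` gives the same answer regardless of which tests ran earlier.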