huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0

apply sdpa for mpt and internlm #676

Closed eaidova closed 3 months ago

eaidova commented 3 months ago

What does this PR do?

Optimizes the MPT and InternLM models with scaled dot product attention, and fixes the export of the baichuan-13b model.
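
For readers unfamiliar with the optimization, the sketch below illustrates the general pattern of replacing an explicit attention computation with torch.nn.functional.scaled_dot_product_attention, which lets PyTorch pick a fused kernel. It is an illustration only, with assumed tensor names and shapes, not the actual patch applied to the MPT or InternLM modeling code in this PR.

```python
# Illustrative sketch: swap a manual softmax(QK^T / sqrt(d)) V computation
# for torch.nn.functional.scaled_dot_product_attention. Shapes and names
# are assumptions for the example, not the optimum-intel patch itself.
import math
import torch
import torch.nn.functional as F

def manual_attention(q, k, v, attn_mask=None):
    # Baseline: explicit attention computation.
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if attn_mask is not None:
        scores = scores + attn_mask
    return torch.matmul(torch.softmax(scores, dim=-1), v)

def sdpa_attention(q, k, v, attn_mask=None):
    # Optimized path: PyTorch selects a fused SDPA kernel where available.
    return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

# Example shapes: (batch, heads, seq_len, head_dim)
q = k = v = torch.randn(1, 8, 16, 64)
assert torch.allclose(manual_attention(q, k, v), sdpa_attention(q, k, v), atol=1e-5)
```

Both paths compute the same result; the SDPA call mainly improves speed and memory use, which also benefits the exported OpenVINO graph.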

Before submitting

HuggingFaceDocBuilderDev commented 3 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

eaidova commented 3 months ago

@echarlaix could you please take a look?

AlexKoff88 commented 3 months ago

Can we have a test for each model architecture that is updated in this PR?

eaidova commented 3 months ago

Can we have a test for each model architecture that is updated in this PR?

This is an update for models that are already covered by the existing tests. I only added baichuan, because a different code version is used in the tests; the mpt and internlm tests remain unchanged.
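
For context, such architecture coverage is usually expressed as a parametrized export-and-generate check. The sketch below is a minimal, hypothetical example of that pattern, not the project's actual test code; the model ID is a placeholder and would need to be replaced with a real tiny test checkpoint.

```python
# Hypothetical sketch of a per-architecture export test; the model ID below
# is a placeholder, not a real checkpoint guaranteed to exist.
import pytest
from transformers import AutoTokenizer
from optimum.intel import OVModelForCausalLM

@pytest.mark.parametrize("model_id", ["some-org/tiny-random-mpt"])
def test_export_and_generate(model_id):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the PyTorch model to OpenVINO IR on the fly.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    inputs = tokenizer("Hello", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=4)
    # The generated sequence should extend the prompt.
    assert outputs.shape[1] >= inputs["input_ids"].shape[1]
```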