huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0
409 stars, 112 forks

Compress VLM model components to int8_sym instead of int8_asym #1002

Closed. nikita-savelyevv closed this 6 days ago

nikita-savelyevv commented 1 week ago

What does this PR do?

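As the title says, this PR compresses VLM model components to int8_sym instead of int8_asym. It prepares for dynamic quantization (DQ) of ViTs in the next OpenVINO (OV) release.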
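For readers unfamiliar with the two modes, here is a minimal sketch of the difference in plain Python. It is not the NNCF or OpenVINO implementation: the function names are hypothetical, and it assumes one scale per tensor, whereas real weight compression typically works per channel. Symmetric mode stores only a scale with the zero point fixed at 0. Asymmetric mode stores a scale and a zero point.

```python
def quantize_sym(weights):
    """Symmetric int8 (illustrative): zero point is fixed at 0, only a scale is kept."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def quantize_asym(weights):
    """Asymmetric uint8 (illustrative): scale plus zero point map [min, max] to [0, 255]."""
    lo = min(min(weights), 0.0)  # keep 0.0 inside the range so it stays representable
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize_sym(q, scale):
    # No zero point to subtract in the symmetric case.
    return [v * scale for v in q]


def dequantize_asym(q, scale, zero_point):
    # The zero point has to be subtracted before scaling.
    return [(v - zero_point) * scale for v in q]


# Hypothetical example weights, not taken from any model.
weights = [-1.0, -0.25, 0.0, 0.5, 2.0]

q_sym, scale_sym = quantize_sym(weights)
q_asym, scale_asym, zp = quantize_asym(weights)

print("sym :", q_sym, "scale =", scale_sym)
print("asym:", q_asym, "scale =", scale_asym, "zero_point =", zp)
```

The symmetric variant drops the zero-point subtraction at dequantization time, at the cost of a coarser grid when the weight range is skewed to one side.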

HuggingFaceDocBuilderDev commented 1 week ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

AlexKoff88 commented 6 days ago

@nikita-savelyevv, please fix tests.