huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0

Fix nncf quantization for decoder models #727

Closed · echarlaix closed this pull request 1 month ago

echarlaix commented 1 month ago

Adds a fix so that instances of OVModelForCausalLM can be quantized with the quantizer.

cc @nikita-savelyevv
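
A minimal sketch of the flow this fix targets, assuming the public OVQuantizer API; the model id, dataset, and preprocessing below are illustrative and not taken from the PR:

```python
from transformers import AutoTokenizer
from optimum.intel import OVModelForCausalLM, OVQuantizer

model_id = "gpt2"  # illustrative model; any causal LM exportable to OpenVINO
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Export the model to OpenVINO IR and load it as an OVModelForCausalLM instance.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)

# Instantiate the quantizer directly from the OVModelForCausalLM instance,
# which is the decoder-model case this PR addresses.
quantizer = OVQuantizer.from_pretrained(model)

def preprocess_fn(examples):
    # Tokenize raw text samples for calibration.
    return tokenizer(examples["text"], truncation=True, max_length=128)

# Build a small calibration dataset (dataset choice and sample count are arbitrary here).
calibration_dataset = quantizer.get_calibration_dataset(
    "wikitext",
    dataset_config_name="wikitext-2-raw-v1",
    preprocess_function=preprocess_fn,
    num_samples=64,
    dataset_split="train",
)

# Run NNCF quantization and save the resulting model.
quantizer.quantize(
    calibration_dataset=calibration_dataset,
    save_directory="ov_gpt2_quantized",
)
```

Per the PR title and description, this quantizer path previously did not work for decoder models such as OVModelForCausalLM; the change is what enables it.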
