huggingface / optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools
https://huggingface.co/docs/optimum/main/en/intel/index
Apache License 2.0

Increase default 4-bit compression ratio from 0.8 to 1.0 #805

Closed nikita-savelyevv closed 3 weeks ago

nikita-savelyevv commented 3 weeks ago

This PR aligns the default 4-bit weight compression parameters between the optimum-intel and openvino.genai repositories. The default parameters are: bits=4, sym=False, ratio=1.0, group_size=128. These defaults are applied when there is no custom compression recipe for the given model id.
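The fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the names DEFAULT_4BIT_CONFIG, CUSTOM_RECIPES and select_4bit_config, as well as the example model id and its recipe values, are hypothetical. Only the four default values are taken from the description.

```python
# Default 4-bit weight compression parameters, as stated in the PR description.
DEFAULT_4BIT_CONFIG = {"bits": 4, "sym": False, "ratio": 1.0, "group_size": 128}

# Hypothetical per-model recipes; the model id and its values are made up
# purely for illustration.
CUSTOM_RECIPES = {
    "example-org/example-model": {"bits": 4, "sym": True, "ratio": 0.8, "group_size": 64},
}


def select_4bit_config(model_id: str) -> dict:
    """Return the custom recipe for model_id if one exists, else the defaults.

    A copy is returned so callers cannot mutate the shared dictionaries.
    """
    return dict(CUSTOM_RECIPES.get(model_id, DEFAULT_4BIT_CONFIG))


if __name__ == "__main__":
    # A model without a custom recipe falls back to the defaults.
    print(select_4bit_config("some-org/unlisted-model"))
    # A model with a custom recipe keeps its own parameters.
    print(select_4bit_config("example-org/example-model"))
```

With this layout, raising the default ratio from 0.8 to 1.0 only affects models that have no entry of their own.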

Corresponding PR to openvino.genai: https://github.com/openvinotoolkit/openvino.genai/pull/577

nikita-savelyevv commented 3 weeks ago

@AlexKoff88 proposed raising the default compression ratio from 0.8 to 1.0.

cc @MaximProshin @eaidova

HuggingFaceDocBuilderDev commented 3 weeks ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

echarlaix commented 3 weeks ago

To fix the code style check, you can do the following:

pip install .[quality]
make style

nikita-savelyevv commented 3 weeks ago

Thanks! Actually, I have had trouble with the make style command for quite some time. For some reason it modifies many files in the repo, even ones I didn't touch. Possibly I need to change some local configuration. Do you have any suggestions?

echarlaix commented 3 weeks ago

Could it be a mismatch in the black or ruff version? I think it would make sense to pin a specific version on our side; I will open a PR for this.
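One way to check for such a mismatch is to print the locally installed formatter versions and compare them with whatever the repository expects. The sketch below uses only the Python standard library. The helper names installed_version and formatter_versions are hypothetical, and the versions the project actually requires are not stated in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def formatter_versions(packages=("black", "ruff")):
    """Map each formatter name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, ver in formatter_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If the reported versions differ from the ones used in CI, reinstalling the quality extras in a clean environment is a reasonable first step.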

echarlaix commented 3 weeks ago

Will merge once the tests pass. I think only TEST_4BIT_CONFIGURATONS needs to be updated: https://github.com/huggingface/optimum-intel/blob/48cc82aa361beb58cd690775b400702a3de1421b/tests/openvino/test_exporters_cli.py#L88