intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

https://intel.github.io/neural-compressor/

Apache License 2.0

2.18k stars 252 forks

update documentation for 3x API #1923

Closed (chensuyue closed 2 months ago)

chensuyue commented 2 months ago

Type of Change

documentation

Description

[x] Main page docs matrix, @yiliu30 please update description for docs/3x/design.md.
[ ] ~~Main page get started examples, @yiliu30~~ Replace auto-round example with fp8 @xin3he in another PR.
[x] 2.x feature overview md, 2x_user_guide.md.
[ ] Installation Guide update for 3x API, will be submitted in another PR.
[x] docs for IO page, get_started.md.

github-actions[bot] commented 2 months ago

⚡ Required checks status: All passing 🟢

No groups match the files changed in this PR.

Thank you for your contribution! 💜

Note: This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.