intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0
2.23k stars 257 forks

add online doc for 2.4, 2.5, 2.6, 3.0 #1976

Closed NeoZhangJianyu closed 3 months ago

NeoZhangJianyu commented 3 months ago

Type of Change

Add online documentation for versions 2.4, 2.5, 2.6, and 3.0.

Description

detailed description

Expected Behavior & Potential Risk

the expected behavior that is triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependencies introduced or removed