intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0

Add woq examples #1982

Open · Kaihui-intel opened 1 month ago

Kaihui-intel commented 1 month ago

Type of Change

feature

Description

Add weight-only quantization (WOQ) examples for the AWQ, TEQ, and AutoRound algorithms; a rough sketch of such a flow is shown below.
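
For orientation only, the sketch below illustrates what an AWQ weight-only flow looks like with the neural_compressor 3.x PyTorch API (`AWQConfig` / `prepare` / `convert`). The model name, calibration function, and exact argument details are assumptions for illustration and may differ from the examples actually added in this PR.

```python
# Minimal, illustrative WOQ sketch (not code from this PR).
from transformers import AutoModelForCausalLM, AutoTokenizer
from neural_compressor.torch.quantization import AWQConfig, prepare, convert

model_name = "facebook/opt-125m"  # placeholder model for the sketch
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

example_inputs = tokenizer("Hello, my dog is cute.", return_tensors="pt")

# AWQ weight-only config with default options; TEQConfig / AutoRoundConfig
# would follow the same prepare -> calibrate -> convert flow.
quant_config = AWQConfig()

def run_fn(model):
    # Calibration pass: feed representative data so AWQ can collect the
    # activation statistics used to scale weights before quantization.
    model(**example_inputs)

model = prepare(model, quant_config, example_inputs=example_inputs)
run_fn(model)
model = convert(model)
```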

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed