intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0
2.18k stars 252 forks source link

bump main version into v3.1 #1974

Closed chensuyue closed 1 month ago

chensuyue commented 1 month ago

Type of Change

bump main version into v3.1

Description

bump main version into v3.1

Expected Behavior & Potential Risk

the expected behavior that triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed