Bump main version to v3.1

intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

https://intel.github.io/neural-compressor/

Apache License 2.0

Closed: chensuyue closed this 1 month ago

chensuyue commented 1 month ago

Bump main version to v3.1

the expected behavior triggered by this PR

how to reproduce the test (including hardware information)

any library dependency introduced or removed