intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0

[Doc] Add autoround EMNLP24 paper to pub list #2014

Closed · thuang6 closed this 1 day ago

thuang6 commented 4 days ago

Type of Change

documentation

wenhuach21 commented 4 days ago

It's on the Findings track, not the main track. Shall we add this?

thuang6 commented 4 days ago

> It's on the Findings track, not the main track. Shall we add this?

It is still part of EMNLP, so IMO it is fine to add it.