intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0
2.18k stars
252 forks
update readme for fp8
#1979
Closed
xin3he closed 1 month ago
xin3he commented 1 month ago
Type of Change
documentation
Description
Update the README for FP8.