Closed: happierpig closed this 7 months ago
This PR integrates FP4 quantization (a non-uniform quantization format) into the GPTQ codebase. Atom can now apply FP4 quantization to weights.
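For readers unfamiliar with non-uniform quantization, here is a minimal sketch of rounding weights to an FP4 grid. It is not the code from this PR. It assumes the E2M1 variant of FP4 (the PR does not say which variant is used) and a single scale per weight group. The names `FP4_GRID` and `quantize_fp4` are hypothetical. It shows only the round-to-nearest step, not GPTQ's error compensation.

```python
# Assumption: E2M1 magnitudes. The PR does not state which FP4 variant it uses.
FP4_E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

# Signed grid, sorted ascending. Spacing grows with magnitude (non-uniform).
FP4_GRID = [-m for m in reversed(FP4_E2M1_MAGNITUDES[1:])] + FP4_E2M1_MAGNITUDES


def quantize_fp4(weights):
    """Fake-quantize a group of floats to the FP4 grid using one shared scale.

    Returns dequantized values (grid point * scale), not 4-bit codes.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    # Map the largest magnitude in the group onto the largest grid value.
    scale = max_abs / FP4_GRID[-1]
    # Round each scaled weight to the nearest grid point, then rescale.
    return [min(FP4_GRID, key=lambda g: abs(w / scale - g)) * scale for w in weights]


if __name__ == "__main__":
    print(quantize_fp4([6.0, 3.0, 1.0, -0.4, 0.0]))
```

Unlike INT4, the grid steps are 0.5 near zero and 2.0 near the maximum, so small weights are represented more finely than large ones.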
LGTM.