wejoncy/QLLM
A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ, with easy export to ONNX/ONNX Runtime.
Apache License 2.0
145 stars
14 forks
fix attn_implementation #90
Closed
wejoncy closed 8 months ago