intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization
https://github.com/intel/neural-speed
Apache License 2.0
350 stars, 38 forks
[Bug] Fix glm4 accuracy error
#294
Closed
intellinjun closed 5 months ago
intellinjun commented 5 months ago
glm4-9b-chat q4j32 result