Closed zixi01chen closed 3 weeks ago
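I noticed that the quantized Qwen models lose very little quality. Is inference done entirely in Int8, or are only the weights stored in Int8 and dequantized to fp16 during inference?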
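Hi, both AWQ and GPTQ are weight-only quantization methods. Only the weights are stored as low-bit integers; activations stay in fp16, and the weights are typically dequantized to fp16 for the matrix multiplications rather than the whole forward pass running in Int8.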
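To make the "weight quantization" point concrete, here is a minimal NumPy sketch of the storage and compute pattern. It is not the AWQ or GPTQ algorithm and not Qwen's actual kernels: it assumes plain symmetric per-output-channel Int8 rounding, and the function names (`quantize_weights_int8`, `dequantize_to_fp16`, `linear_weight_only`) are hypothetical, chosen only for illustration.

```python
import numpy as np

def quantize_weights_int8(w):
    """Symmetric per-output-channel Int8 quantization (illustrative assumption,
    not the calibration-based procedure used by AWQ or GPTQ)."""
    w = np.asarray(w, dtype=np.float32)
    max_abs = np.abs(w).max(axis=1, keepdims=True)
    # One scale per output row; fall back to 1.0 for all-zero rows.
    scale = np.where(max_abs > 0, max_abs / 127.0, 1.0).astype(np.float32)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_to_fp16(q, scale):
    """Recover an fp16 approximation of the original weights."""
    return (q.astype(np.float32) * scale).astype(np.float16)

def linear_weight_only(x_fp16, q, scale):
    """Weights are stored as Int8, but the matmul itself runs in fp16."""
    w_fp16 = dequantize_to_fp16(q, scale)
    return x_fp16 @ w_fp16.T

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)   # (out_features, in_features)
x = rng.standard_normal((4, 16)).astype(np.float16)   # fp16 activations

q, scale = quantize_weights_int8(w)
y_quant = linear_weight_only(x, q, scale)

# Compare against the unquantized fp32 result.
y_ref = x.astype(np.float32) @ w.T
max_err = float(np.abs(y_quant.astype(np.float32) - y_ref).max())
```

In this sketch only `q` (Int8) and `scale` need to be kept in memory; the activations `x` and the output `y_quant` never leave fp16, which is the sense in which the quantization applies to weights only.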
Start Date
No response
Implementation PR
No response
Reference Issues
No response
Summary
I noticed that the quantized Qwen models lose very little quality. Is inference done entirely in Int8, or are only the weights stored in Int8 and dequantized to fp16 during inference?
Basic Example
None
Drawbacks
None
Unresolved questions
No response