Is AWQ 4-bit deployment of Yi-34B supported now?

ModelTC / lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Is AWQ 4-bit deployment of Yi-34B supported now? #291

Open xyfZzz opened 10 months ago

hiworldwzj commented 10 months ago

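@xyfZzz It is not well supported yet. First, the open-source Triton int4 weight-only GEMM kernel does not perform very well. Second, loading AWQ weights directly requires adapting the weight-loading code for that format. We will keep optimizing and improving this.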

xyfZzz commented 10 months ago

OK. Does that mean 4-bit GPTQ is also not supported for now?

hiworldwzj commented 10 months ago

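@xyfZzz At the moment only some of the quantized compute kernels are supported. By default the original weights are quantized directly, without PTQ or any other weight adjustment, and loading already-quantized weights such as GPTQ checkpoints has not been adapted yet.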
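
To illustrate what "quantizing the original weights directly" can mean, here is a minimal sketch of round-to-nearest int4 weight-only quantization with per-group scales. This is not LightLLM's implementation: the function names `quantize_int4_rtn` and `dequantize_int4`, the group size of 128, and the asymmetric zero-point scheme are all assumptions made for illustration. GPTQ and AWQ differ in that they adjust or rescale the weights using calibration data before rounding.

```python
import numpy as np

def quantize_int4_rtn(w, group_size=128):
    """Hypothetical helper: round-to-nearest asymmetric int4 quantization.

    w: float array of shape (out_features, in_features), where in_features
    is divisible by group_size. Each group of `group_size` consecutive
    input channels gets its own scale and zero point.
    """
    out_f, in_f = w.shape
    assert in_f % group_size == 0, "in_features must be divisible by group_size"
    g = w.reshape(out_f, in_f // group_size, group_size).astype(np.float32)

    w_min = g.min(axis=-1, keepdims=True)
    w_max = g.max(axis=-1, keepdims=True)
    # 4 bits give 16 levels (0..15), so the range is split into 15 steps.
    scale = np.maximum(w_max - w_min, 1e-8) / 15.0
    zero = np.clip(np.round(-w_min / scale), 0, 15)

    # Plain rounding of the original weights; no calibration data is used.
    q = np.clip(np.round(g / scale) + zero, 0, 15).astype(np.uint8)
    return q, scale, zero

def dequantize_int4(q, scale, zero):
    """Map int4 codes back to float32 weights of shape (out_features, in_features)."""
    g = (q.astype(np.float32) - zero) * scale
    return g.reshape(g.shape[0], -1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 256)).astype(np.float32)
    q, scale, zero = quantize_int4_rtn(w)
    w_hat = dequantize_int4(q, scale, zero)
    print("max abs reconstruction error:", float(np.abs(w - w_hat).max()))
```

A real int4 kernel would additionally pack two 4-bit codes into each byte and fuse dequantization into the GEMM; the sketch keeps one code per `uint8` for readability.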

xyfZzz commented 10 months ago

OK, thanks a lot!

RanchiZhao commented 7 months ago

Is this available now? I simply applied GPTQ and AWQ to Yi-6B and tried LoRA training on the quantized model, but the loss is NaN.