LlamaFamily / Llama-Chinese

Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, with the latest Llama3 learning resources collected in real time. All code has been updated for Llama3. Building the best Chinese Llama LLM, fully open source and commercially usable.
https://llama.family

AutoGPTQForCausalLM.from_quantized fails when loading the official 4-bit quantized model: NameError: name 'autogptq_cuda_256' is not defined #75

Open gpww opened 1 year ago

gpww commented 1 year ago

AutoGPTQForCausalLM.from_quantized raises an error when loading the official 4-bit quantized model (Llama2-Chinese-13b-Chat-4bit): NameError: name 'autogptq_cuda_256' is not defined

```
Traceback (most recent call last):
  File "/root/Llama2-Chinese/chat_gradio.py", line 90, in <module>
    model = AutoGPTQForCausalLM.from_quantized(args.model_name_or_path, low_cpu_mem_usage=True, device="cuda:0", use_triton=False, inject_fused_attention=False, inject_fused_mlp=False)
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/auto.py", line 94, in from_quantized
    return quant_func(
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 749, in from_quantized
    make_quant(
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
    make_quant(
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
    make_quant(
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
    make_quant(
  [Previous line repeated 1 more time]
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 84, in make_quant
    new_layer = QuantLinear(
  File "/usr/local/lib/python3.10/site-packages/auto_gptq/nn_modules/qlinear/qlinear_cuda_old.py", line 83, in __init__
    self.autogptq_cuda = autogptq_cuda_256
NameError: name 'autogptq_cuda_256' is not defined
```
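The traceback shows `qlinear_cuda_old.py` referencing `autogptq_cuda_256`, the compiled CUDA kernel extension that auto-gptq builds at install time; when the build silently fails (e.g. pip cannot see a CUDA toolkit), the import inside auto-gptq is swallowed and the name is never bound, producing this NameError at model-load time. A minimal sketch of a pre-flight check (the function name here is hypothetical; only the module name `autogptq_cuda_256` comes from the traceback):

```python
import importlib.util

def autogptq_cuda_kernels_present() -> bool:
    """Return True if the compiled auto-gptq CUDA kernel module is importable.

    If this returns False, AutoGPTQForCausalLM.from_quantized will hit the
    NameError above as soon as it tries to construct a QuantLinear layer.
    """
    return importlib.util.find_spec("autogptq_cuda_256") is not None

if __name__ == "__main__":
    print(autogptq_cuda_kernels_present())
```

Running this before loading the model turns a deep, confusing traceback into an explicit yes/no answer about the installation.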

ZHangZHengEric commented 1 year ago

Your auto-gptq installation is broken: the library was installed without its compiled CUDA kernels.
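A common fix, sketched below and not verified against this exact environment, is to reinstall auto-gptq so the CUDA extension actually gets built or fetched. The cu118 index URL is an example; pick the one matching your local CUDA version, and building from source requires nvcc on PATH.

```shell
# Remove the broken install first.
pip uninstall -y auto-gptq

# Option 1: install a prebuilt wheel matching your CUDA version
# (cu118 here is an assumed example; adjust to your toolkit).
pip install auto-gptq \
  --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/

# Option 2: build from source so the kernels compile against the
# locally installed CUDA toolkit.
git clone https://github.com/PanQiWei/AutoGPTQ.git
cd AutoGPTQ
pip install -v .
```

After reinstalling, `python -c "import autogptq_cuda_256"` should succeed; if it still fails, the kernel build was skipped again and the pip log (`-v`) will show why.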