AutoGPTQForCausalLM.from_quantized fails when loading the official 4-bit quantized model (Llama2-Chinese-13b-Chat-4bit): NameError: name 'autogptq_cuda_256' is not defined
Traceback (most recent call last):
File "/root/Llama2-Chinese/chat_gradio.py", line 90, in <module>
model = AutoGPTQForCausalLM.from_quantized(args.model_name_or_path,low_cpu_mem_usage=True, device="cuda:0", use_triton=False,inject_fused_attention=False,inject_fused_mlp=False)
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/auto.py", line 94, in from_quantized
return quant_func(
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 749, in from_quantized
make_quant(
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
make_quant(
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
make_quant(
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 92, in make_quant
make_quant(
[Previous line repeated 1 more time]
File "/usr/local/lib/python3.10/site-packages/auto_gptq/modeling/_utils.py", line 84, in make_quant
new_layer = QuantLinear(
File "/usr/local/lib/python3.10/site-packages/auto_gptq/nn_modules/qlinear/qlinear_cuda_old.py", line 83, in __init__
self.autogptq_cuda = autogptq_cuda_256
NameError: name 'autogptq_cuda_256' is not defined
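
The last frame shows that the name `autogptq_cuda_256` was never bound inside `qlinear_cuda_old.py`. One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the compiled CUDA extension of that name is not installed in this environment, so an earlier import of it failed quietly. A minimal sketch for checking this follows. The helper `extension_status` is hypothetical and not part of auto_gptq, and the second module name, `autogptq_cuda_64`, is likewise an assumption about how the package names its extensions.

```python
import importlib.util


def extension_status(names=("autogptq_cuda_256", "autogptq_cuda_64")):
    """Return {module_name: bool} saying whether each top-level module can be located.

    The default names are assumptions: the first is taken from the traceback,
    the second is a guess at a sibling extension module.
    """
    # find_spec returns None when a top-level module cannot be found,
    # and it does not import (or execute) the module itself.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in extension_status().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If the modules are reported missing in the same interpreter that runs `chat_gradio.py`, the installed auto_gptq package probably lacks its CUDA kernels, and reinstalling a build that matches the local CUDA and PyTorch versions would be the next thing to try.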