When I run inference with the 8-bit/4-bit quantized version, I get the error: `Calling cuda() is not supported for 4-bit or 8-bit quantized models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct dtype.` From what I found online, this may be a version issue with the transformers or deepspeed library. Is that the cause? If so, could you share the versions of these two libraries from an environment where inference succeeds?
Test environment:
transformers 4.32.0
deepspeed 0.9.2
Full traceback:
```
Traceback (most recent call last):
  File "/opt/conda/envs/groma/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/groma/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/zhang/project/Groma-main/groma/eval/run_groma.py", line 138, in <module>
    eval_model(model_name, args.quant_type, args.image_file, args.query)
  File "/mnt/zhang/project/Groma-main/groma/eval/run_groma.py", line 58, in eval_model
    model = GromaModel.from_pretrained(model_name, **kwargs).cuda()
  File "/opt/conda/envs/groma/lib/python3.9/site-packages/transformers/modeling_utils.py", line 1998, in cuda
    raise ValueError(
ValueError: Calling cuda() is not supported for 4-bit or 8-bit quantized models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct dtype.
```
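For what it's worth, the error is raised by transformers itself rather than deepspeed: when `load_in_8bit`/`load_in_4bit` is set, `from_pretrained` already places the model on the right device, so the explicit `.cuda()` call at run_groma.py line 58 is what triggers it. A minimal sketch of a guard that skips the move for the quantized paths (the `quant_type` values and the contents of `kwargs` here are assumptions inferred from the traceback, not Groma's actual code):

```python
# Sketch only: assumes quant_type is one of {'none', '8bit', '4bit'} and that
# kwargs already carries the matching load_in_8bit / load_in_4bit flags,
# mirroring the call in run_groma.py shown in the traceback above.
model = GromaModel.from_pretrained(model_name, **kwargs)
if quant_type not in ('8bit', '4bit'):
    # Quantized models are already dispatched to GPU by from_pretrained
    # (via bitsandbytes/accelerate), so only move full-precision models.
    model = model.cuda()
```

That said, knowing the maintainers' working transformers/deepspeed versions would still help rule out a version mismatch.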