Open ai499 opened 1 year ago
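Please install the latest version of sat:
pip install SwissArmyTransformer --upgrade
Also, we recommend using the sat version of the model rather than the Hugging Face version, because the sat version is actively updated and maintained.
@1049451037 I still get the error after upgrading. I want to run inference on CPU. Is that not supported?
It is supported. The error occurs because your code contains GPU-specific code. Which file are you running, and have you modified the code yourself?
@1049451037 I didn't see any instructions in the project for loading the model on CPU.
Because none are needed; it should just run normally.
I'm simply loading it with transformers: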
from transformers import AutoTokenizer, AutoModel
model = AutoModel.from_pretrained("/root/visualglm_6B", trust_remote_code=True).float()
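Just these two lines.
I don't maintain the transformers version; I recommend using the sat version.
Just use cli_demo.py directly.
sat can load the model on CPU.
OK, thanks, I'll give it a try.
SwissArmyTransformer is at version 0.4.8, which is already the latest, and running cli_demo.py directly still produces this error. How did you solve it? @ai499 @1049451037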
[INFO] DeepSpeed/CUDA is not installed, fallback to Pytorch checkpointing.
[INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] Failed to load bitsandbytes:No module named 'bitsandbytes'
[INFO] building VisualGLMModel model ...
Traceback (most recent call last):
File "/work/home/VisualGLM-6B-main/cli_demo.py", line 103, in
Try installing the latest sat from GitHub:
git clone https://github.com/THUDM/SwissArmyTransformer
cd SwissArmyTransformer
pip install .
Thanks, that solved the problem! But now there is a new one. When I use the GPU for inference, I get the following error:
RuntimeError: The NVIDIA driver on your system is too old (found version 11070). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org/ to install a PyTorch version that has been compiled with your version of the CUDA driver.
It runs on CPU, but that is far too slow. My CUDA version is 11.7, which isn't that old, is it? Does this project have any requirements on the CUDA version?
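Uh, CPU inference doesn't use CUDA; CUDA is for the GPU.
We have no requirement on the CUDA version. The CUDA version your PyTorch build was compiled against just needs to match the CUDA version on your machine.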
Right, what I meant is that I can't use the GPU. I checked, and torch.cuda.is_available() is indeed False. Thanks for the reply.
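The number in the error message can be read directly: CUDA encodes versions as 1000 × major + 10 × minor, so 11070 means the driver supports up to CUDA 11.7. Below is a minimal sketch of the comparison discussed above. The helper names decode_cuda_version and build_newer_than_driver are hypothetical, the live check at the bottom assumes PyTorch is installed, and the comparison is deliberately conservative (CUDA minor-version compatibility may let some newer builds run anyway).

```python
def decode_cuda_version(encoded: int) -> tuple[int, int]:
    """Decode CUDA's integer version (1000*major + 10*minor), e.g. 11070 -> (11, 7)."""
    return encoded // 1000, (encoded % 1000) // 10


def build_newer_than_driver(build_cuda: str, driver_encoded: int) -> bool:
    """Return True if the PyTorch build targets a newer CUDA than the driver reports."""
    major, minor = (int(part) for part in build_cuda.split(".")[:2])
    return (major, minor) > decode_cuda_version(driver_encoded)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # torch.version.cuda is None for CPU-only builds.
        print("CUDA version this PyTorch build targets:", torch.version.cuda)
        print("GPU usable:", torch.cuda.is_available())
```

If the build targets a newer CUDA than the driver supports, either update the driver or install a PyTorch build compiled for the older CUDA version, as the error message itself suggests.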
If you want to run inference on CPU, you need 30 GB of RAM.
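A back-of-the-envelope estimate lands in the same range as the 30 GB figure. This is a minimal sketch, assuming about 6 billion parameters (read off the model name; the exact count, including the image encoder, is not stated in this thread) stored as float32. The function name is hypothetical, and activations and other overhead come on top of the weights.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumed parameter count: ~6e9, taken from the "6B" in the model name.
fp32 = weight_memory_gib(6e9, 4)  # float32: 4 bytes per parameter
fp16 = weight_memory_gib(6e9, 2)  # float16: 2 bytes per parameter
print(f"float32: {fp32:.1f} GiB, float16: {fp16:.1f} GiB")
# prints "float32: 22.4 GiB, float16: 11.2 GiB"
```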
Modify /usr/local/python3/lib/python3.10/site-packages/sat/arguments.py:
torch.distributed.init_process_group(
    backend="gloo",
    world_size=args.world_size, rank=args.rank,
    init_method=init_method)
Replacing backend=args.distributed_backend with backend="gloo" makes generation work on CPU.
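This works because NCCL only supports GPUs, whereas gloo is PyTorch's CPU-capable backend. A less invasive variant of the same idea, sketched under assumptions, picks the backend from torch.cuda.is_available() rather than hard-coding it. pick_backend and init_single_process are hypothetical helpers, not part of sat, and the address and port are placeholders.

```python
def pick_backend(cuda_available: bool) -> str:
    """NCCL requires GPUs; gloo also works on CPU-only machines."""
    return "nccl" if cuda_available else "gloo"


def init_single_process(port: int = 29500) -> str:
    """Initialise a one-process group with a backend that fits the hardware."""
    # Imported here so pick_backend stays usable without PyTorch installed.
    import torch
    import torch.distributed as dist

    backend = pick_backend(torch.cuda.is_available())
    if not dist.is_initialized():
        dist.init_process_group(
            backend=backend,
            init_method=f"tcp://127.0.0.1:{port}",  # placeholder address and port
            world_size=1,
            rank=0,
        )
    return backend
```

On a machine without a usable GPU, init_single_process() would select gloo, which is the same effect as the manual edit above.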
Inference code:
from transformers import AutoTokenizer, AutoModel
model = AutoModel.from_pretrained("/root/visualglm_6B", trust_remote_code=True).float()
Error message:
[2023-08-22 08:45:55,509] [INFO] DeepSpeed/CUDA is not installed, fallback to Pytorch checkpointing.
[2023-08-22 08:45:55,529] [WARNING] DeepSpeed Not Installed, you cannot import training_main from sat now.
Traceback (most recent call last):
File "", line 1, in
File "/root/.local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 488, in from_pretrained
return model_class.from_pretrained(
File "/root/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm_6B/modeling_chatglm.py", line 1345, in __init__
self.image_encoder = BLIP2(config.eva_config, config.qformer_config)
File "/root/.cache/huggingface/modules/transformers_modules/visualglm_6B/visual.py", line 59, in __init__
self.vit = EVAViT(EVAViT.get_args(eva_args))
File "/root/.cache/huggingface/modules/transformers_modules/visualglm_6B/visual.py", line 20, in __init__
super().__init__(args, transformer=transformer, parallel_output=parallel_output, **kwargs)
File "/root/.local/lib/python3.10/site-packages/sat/model/official/vit_model.py", line 110, in __init__
super().__init__(args, transformer=transformer, parallel_output=parallel_output, **kwargs)
File "/root/.local/lib/python3.10/site-packages/sat/model/base_model.py", line 88, in __init__
success = _simple_init(model_parallel_size=args.model_parallel_size)
File "/root/.local/lib/python3.10/site-packages/sat/arguments.py", line 304, in _simple_init
if initialize_distributed(args): # first time init model parallel, print warning
File "/root/.local/lib/python3.10/site-packages/sat/arguments.py", line 507, in initialize_distributed
torch.distributed.init_process_group(
File "/root/.local/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 907, in init_process_group
default_pg = _new_process_group_helper(
File "/root/.local/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1024, in _new_process_group_helper
backend_class = ProcessGroupNCCL(backend_prefix_store, group_rank, group_size, pg_options)
RuntimeError: ProcessGroupNCCL is only supported with GPUs, no GPUs found!