Ucas-HaoranWei / Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)
CUDA out of memory #5

Open sixgod-666 opened 5 months ago

sixgod-666 commented 5 months ago

What is the minimum amount of GPU memory required?

Ucas-HaoranWei commented 5 months ago

Less than 9 GB. The online demo runs on a 1080 Ti with 11 GB. When you run the run script, delete device_map = "CUDA" at the point where the model is loaded.
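To compare a machine against the "less than 9 GB" figure above, one option is to ask `nvidia-smi` how much memory each GPU has free. The following is only a minimal sketch: the helper names and the 9 GiB threshold constant are illustrative assumptions, not part of the Vary-toy code, and it assumes `nvidia-smi` is on the PATH and reports plain integers.

```python
import shutil
import subprocess

# Real nvidia-smi query flags; values are reported in MiB with nounits.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.total,memory.free",
    "--format=csv,noheader,nounits",
]

# Assumption: threshold taken from the maintainer's "less than 9 GB" remark.
REQUIRED_MIB = 9 * 1024


def parse_memory_mib(output):
    """Turn 'total, free' lines (MiB) into a list of (total, free) tuples."""
    rows = []
    for line in output.strip().splitlines():
        total, free = (int(field.strip()) for field in line.split(","))
        rows.append((total, free))
    return rows


def gpu_memory_mib():
    """Query nvidia-smi; return [] if it is missing, fails, or is unparsable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    try:
        return parse_memory_mib(result.stdout)
    except ValueError:
        return []


def gpus_with_enough_memory(rows, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose free memory meets the threshold."""
    return [i for i, (_, free) in enumerate(rows) if free >= required_mib]


if __name__ == "__main__":
    rows = gpu_memory_mib()
    print("GPUs (total MiB, free MiB):", rows)
    print("GPUs with enough free memory:", gpus_with_enough_memory(rows))
```

An empty result means either that no GPU was found or that none has enough free memory, so it is worth printing the raw rows as well.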

sixgod-666 commented 5 months ago

File "/workspace/envs/vary/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 285, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 1024]) in "weight" (which has shape torch.Size([2048, 1024])), this look incorrect.

I get this error when loading the model. How can I fix it?

Ucas-HaoranWei commented 5 months ago

I can't reproduce this problem. Are you using the Vary-toy code or the Vary code?

sixgod-666 commented 5 months ago

The Vary-toy code.

Ucas-HaoranWei commented 5 months ago

Could you paste a longer excerpt of the error?

sixgod-666 commented 5 months ago

Traceback (most recent call last):
  File "/workspace/Vary-toy-main/Vary-master/vary/demo/run_qwen_vary.py", line 126, in <module>
    eval_model(args)
  File "/workspace/Vary-toy-main/Vary-master/vary/demo/run_qwen_vary.py", line 43, in eval_model
    model = varyQwenForCausalLM.from_pretrained(model_name, low_cpu_mem_usage=True, trust_remote_code=True)
  File "/workspace/envs/vary/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3091, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/workspace/envs/vary/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3471, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/workspace/envs/vary/lib/python3.10/site-packages/transformers/modeling_utils.py", line 736, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/workspace/envs/vary/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 285, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 1024]) in "weight" (which has shape torch.Size([2048, 1024])), this look incorrect.

Ucas-HaoranWei commented 5 months ago

I tested it again and the code works fine. Was the environment at /workspace/envs/vary/ built from Vary or from Vary-toy? Vary-toy has to be rebuilt.

sixgod-666 commented 5 months ago

Ah, I see. I hadn't rebuilt it. Thanks a lot!
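For anyone hitting the same shape mismatch, a quick way to check which checkout an environment actually imports from is to look up the package location before loading the model. This is a rough sketch using only the standard library. The function names are made up for illustration, and it assumes the package is importable as `vary`, as the traceback paths suggest.

```python
import importlib.util
from pathlib import Path


def package_location(name):
    """Return the directory a top-level package would import from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # Regular package: first entry is the package directory.
        return Path(next(iter(spec.submodule_search_locations))).resolve()
    if spec.origin and spec.origin not in ("built-in", "frozen"):
        # Single-file module: report the directory that contains it.
        return Path(spec.origin).resolve().parent
    return None


def is_installed_from(name, checkout_dir):
    """True if the package resolves to a path inside checkout_dir."""
    location = package_location(name)
    if location is None:
        return False
    checkout = Path(checkout_dir).resolve()
    return checkout == location or checkout in location.parents


if __name__ == "__main__":
    # Assumption: "vary" is the package name; the path is the reporter's checkout.
    print("vary resolves to:", package_location("vary"))
    print("from Vary-toy checkout:", is_installed_from("vary", "/workspace/Vary-toy-main"))
```

If the printed location points at an older Vary checkout or at a copy in site-packages, the environment is still using the previous build and Vary-toy needs to be installed again.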