Issue #21: Some model parameters end up on the CPU during fine-tuning
Open · kunzeng-ch opened this issue 1 year ago
Multi-GPU support currently has some issues; you can run it on a single card for now.
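A minimal sketch of that single-GPU workaround, assuming the usual transformers + bitsandbytes loading path; the model id, 4-bit settings, and `device_map` pinning below are illustrative, not the repo's exact `train_qlora.py` code:

```python
# Illustrative single-GPU setup (assumption: typical QLoRA loading path).
import os
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")  # expose one GPU; set before CUDA is initialized

import torch
from transformers import AutoModel, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModel.from_pretrained(
    "THUDM/chatglm2-6b",        # illustrative model path
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map={"": 0},         # pin every module to cuda:0 instead of letting "auto" place them
)
```

Pinning `device_map={"": 0}` bypasses accelerate's automatic placement, which can otherwise offload some layers to the CPU when it judges GPU memory to be insufficient.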
---- Original message ----
Hey, what is going on here? I ran train_qlora.py directly and got this error:

```
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm2_6b/modeling_chatglm.py", line 588, in forward
    hidden_states, kv_cache = layer(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/.local/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm2_6b/modeling_chatglm.py", line 510, in forward
    attention_output, kv_cache = self.self_attention(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/.local/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm2_6b/modeling_chatglm.py", line 342, in forward
    mixed_x_layer = self.query_key_value(hidden_states)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/peft/tuners/lora.py", line 456, in forward
    after_A = self.lora_A(self.lora_dropout(x))
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
```
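To narrow this down, a quick hypothetical check of where each parameter actually landed (assuming `model` is the PEFT-wrapped model loaded by `train_qlora.py`):

```python
# Illustrative diagnostic: list any parameters left on the CPU, which is what
# makes F.linear mix a cuda:0 input with a cpu weight in the traceback above.
for name, param in model.named_parameters():
    if param.device.type == "cpu":
        print(name, param.device)
```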
Got it, thanks. I'll keep a close eye on your updates.
@kunzeng-ch Sorry, I misread your report. I thought you were running on two cards and the error was between cuda:0 and cuda:1, but yours is between cuda:0 and cpu. Could you share your hardware environment (GPU model, CPU model, operating system, etc.)? No one seems to have hit this problem before.
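As a side note, PyTorch ships an environment collector that gathers exactly this kind of information (`python -m torch.utils.collect_env`); a minimal Python alternative:

```python
# Print the basic environment details requested above.
import platform
import torch

print(platform.platform())              # operating system
print(torch.__version__)                # PyTorch version
print(torch.version.cuda)               # CUDA version PyTorch was built with
print(torch.cuda.get_device_name(0))    # GPU model
```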