When training a model with PEFT and DeepSpeed, the arguments do not seem to be passed in.
Traceback (most recent call last):
  File "/root/autodl-fs/xxhh/train_temp.py", line 124, in <module>
    outputs = model(batch, use_cache=False)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1807, in forward
    loss = self.module(*inputs, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/peft/peft_model.py", line 1178, in forward
    return self.base_model(inputs_embeds=inputs_embeds, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-fs/xxhh/glm2/modeling_chatglm.py", line 943, in forward
    transformer_outputs = self.transformer(
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1529, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-fs/xxhh/glm2/modeling_chatglm.py", line 807, in forward
    batch_size, seq_length = input_ids.shape
AttributeError: 'NoneType' object has no attribute 'shape'
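One possible reading of the traceback: the PEFT frame (peft_model.py, line 1178) calls the base model with `inputs_embeds` only, while modeling_chatglm.py at line 807 reads `input_ids.shape` unconditionally, so `input_ids` is `None` at that point. The sketch below shows the kind of guard that would avoid the error. It is only an illustration: `resolve_batch_and_seq` is a hypothetical helper, not part of ChatGLM2 or PEFT, and the `(batch, seq, hidden)` layout assumed for `inputs_embeds` is an assumption that may not match the layout the model code expects.

```python
from types import SimpleNamespace


def resolve_batch_and_seq(input_ids=None, inputs_embeds=None):
    """Return (batch_size, seq_length) from whichever input was provided.

    Hypothetical helper: assumes input_ids has shape (batch, seq) and
    inputs_embeds has shape (batch, seq, hidden).
    """
    if input_ids is not None:
        batch_size, seq_length = input_ids.shape
    elif inputs_embeds is not None:
        # Fall back to the embeddings when no token ids were passed in.
        batch_size, seq_length = inputs_embeds.shape[:2]
    else:
        raise ValueError("Either input_ids or inputs_embeds must be provided")
    return batch_size, seq_length


# Stand-ins for tensors: only the .shape attribute matters for this sketch.
ids = SimpleNamespace(shape=(2, 8))
embeds = SimpleNamespace(shape=(2, 8, 16))

print(resolve_batch_and_seq(input_ids=ids))
print(resolve_batch_and_seq(inputs_embeds=embeds))
```

Whether such a guard is enough depends on the rest of the model's forward pass, which may use `input_ids` in other places as well.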
Is there an existing issue for this?
Current Behavior
When training a model with PEFT and DeepSpeed, the arguments do not seem to be passed in. The run fails with the traceback shown at the top of this report.
Expected Behavior
No response
Steps To Reproduce
import os
import deepspeed

print("Setting DeepSpeed parameters")
model, optimizer, _, lr_scheduler = deepspeed.initialize(model=model, args=args, config=ds_config, dist_init_required=True)
model.train()

# Create the output folder
path = args.save_loss_path
print(path)
os.makedirs(path, exist_ok=True)

global_step = 0

for epoch in range(args.num_train_epochs):
    save_loss_file = open(args.save_loss_path + "epoch-{}.txt".format(epoch), mode="w", encoding="utf-8")

The training part of the code is as follows.
Environment
Anything else?
No response