Closed: kebijuelun closed this issue 1 year ago.
I also ran into errors when training the RL model with accelerate and deepspeed; the error differs depending on the transformers version.
with transformers 4.28.1
Traceback (most recent call last):
File "/data/public/aic/lwz/lwz_code/trl/examples/stack_llama/scripts/rl_training.py", line 266, in <module>
pipe_outputs = sentiment_pipe(texts, **sent_kwargs)
File "/root/code/transformers/src/transformers/pipelines/text_classification.py", line 155, in __call__
result = super().__call__(*args, **kwargs)
File "/root/code/transformers/src/transformers/pipelines/base.py", line 1090, in __call__
outputs = list(final_iterator)
File "/root/code/transformers/src/transformers/pipelines/pt_utils.py", line 125, in __next__
processed = self.infer(item, **self.params)
File "/root/code/transformers/src/transformers/pipelines/text_classification.py", line 214, in postprocess
dict_scores = [
File "/root/code/transformers/src/transformers/pipelines/text_classification.py", line 215, in <listcomp>
{"label": self.model.config.id2label[i], "score": score.item()} for i, score in enumerate(scores)
ValueError: can only convert an array of size 1 to a Python scalar
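For the 4.28.1 ValueError it may be worth double-checking the sentiment pipeline kwargs, since `return_all_scores=True` was deprecated in favor of `top_k=None` and the text-classification postprocessing changed across versions. As a version-independent workaround, here is a minimal sketch (not the official fix) that bypasses the pipeline's postprocessing and scores the texts with the reward model directly; `reward_model`, `reward_tokenizer`, and `texts` are assumed names standing in for the objects behind `sentiment_pipe` in rl_training.py, and it assumes a single-label sequence-classification reward model:

```python
import torch

@torch.no_grad()
def score_texts(texts, reward_model, reward_tokenizer, device="cuda", batch_size=16):
    """Score texts with the reward model directly, avoiding pipeline postprocess."""
    rewards = []
    for i in range(0, len(texts), batch_size):
        batch = reward_tokenizer(
            texts[i : i + batch_size],
            padding=True,
            truncation=True,
            return_tensors="pt",
        ).to(device)
        logits = reward_model(**batch).logits  # shape: (batch, num_labels)
        # For a single-label reward model, the raw logit is used as the reward.
        rewards.extend(logits[:, 0].tolist())
    return rewards
```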
with transformers 4.30.0.dev0
/root/miniconda3/lib/python3.10/contextlib.py:79 in inner

    76       @wraps(func)
    77       def inner(*args, **kwds):
    78           with self._recreate_cm():
❱   79               return func(*args, **kwds)
    80       return inner

/data/public/aic/lwz/lwz_code/trl/trl/trainer/ppo_trainer.py:859 in batched_forward_pass

   856           input_ids = input_kwargs["input_ids"]
   857           attention_mask = input_kwargs["attention_mask"]
   858
❱  859           logprobs = logprobs_from_logits(logits[:, :-1, :], input_ids[:, 1:])
   860           masks = torch.zeros_like(attention_mask)
   861           masks[:, :-1] = attention_mask[:, 1:]

/data/public/aic/lwz/lwz_code/trl/trl/core.py:106 in logprobs_from_logits

   103       # if gt_device != pred_device:
   104       #     labels = labels.to(pred_device)
   105       # print("after: pred_device: {} gt_device: {}".format(logp.device, labels.device))
❱  106       logpy = torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
   107       return logpy
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
(when checking argument for argument index in method wrapper_gather)
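The cuda:0 / cuda:1 mismatch is raised inside `torch.gather`, where the log-probabilities and the labels end up on different GPUs (the commented-out lines in the traceback show an attempt at exactly this). A defensive patch sketch for `trl/core.py` (it works around the symptom but is not necessarily the root-cause fix for the ZeRO-3 placement issue) moves the labels onto the logits' device right before the gather:

```python
import torch
import torch.nn.functional as F

def logprobs_from_logits(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    logp = F.log_softmax(logits, dim=2)
    # Under ZeRO-3 / model parallelism the labels can land on a different GPU
    # than the logits; torch.gather requires both tensors on the same device.
    labels = labels.to(logp.device)
    return torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
```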
Could anyone share an example deepspeed zero3 config and the exact environment (package versions) that can train the reward model and the RL stage successfully? Thanks.
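For reference, a minimal ZeRO-3 config of the kind used with the HF/accelerate DeepSpeed integration might look like the sketch below. This is only an illustrative starting point, not a verified working setup for stack_llama; the `"auto"` values assume the config is filled in by the Trainer/accelerate integration and would need concrete numbers if passed straight to `deepspeed.initialize`:

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_max_live_parameters": 1e9,
    "stage3_max_reuse_distance": 1e9,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "steps_per_print": 10,
  "wall_clock_breakdown": false
}
```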
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Hi, I hit the following error when training a llama7b/30b model with deepspeed zero3 (the errors for llama7b and llama30b are similar).
training cmd
deepspeed ds_config_zero3.json
env
Does anyone have an idea about this error, or can anyone provide an example deepspeed config that runs successfully? Thanks.
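As a complement to the config sketch above, one generic way to attach such a JSON file when building an Accelerator yourself is accelerate's `DeepSpeedPlugin`. This is a sketch only: TRL's `PPOTrainer` constructs its own Accelerator internally, so in practice the config is usually wired up through `accelerate config` / `accelerate launch`, and the script still has to be started with a distributed launcher for DeepSpeed to initialize. The file name `ds_config_zero3.json` is simply the one mentioned in the command above:

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# Sketch: point accelerate at an existing HF-style DeepSpeed JSON config.
ds_plugin = DeepSpeedPlugin(hf_ds_config="ds_config_zero3.json")
accelerator = Accelerator(deepspeed_plugin=ds_plugin)
# zero3_init (partitioned model init) can additionally be enabled via `accelerate config`.
```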