Open karl-tao-zhang opened 1 year ago
CUDA_VISIBLE_DEVICES=0,1,2,3 python rl_training.py \
    --base_model_name baichuan-inc/baichuan-7B \
    --merged_sft_model_path /root/autodl-tmp/LLM/weights/sft_lora \
    --sft_model_lora_path /root/autodl-tmp/LLM/weights/sft_lora \
    --reward_model_lora_path /root/autodl-tmp/LLM/weights/rm_lora \
    --adafactor False \
    --save_freq 10 \
    --output_max_length 256 \
    --batch_size 2 \
    --gradient_accumulation_steps 2 \
    --batched_gen True \
    --ppo_epochs 4 \
    --seed 0 \
    --learning_rate 1e-5 \
    --early_stopping True \
    --output_dir /root/autodl-tmp/LLM/weights/ppo_lora
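Four 3090s ran out of GPU memory, so I switched to four A40s and then hit this error (full log below). After that I looked through the trl issues for related code. Is this the suggested fix?
tokenizer.eos_token_id = model.config.eos_token_id
tokenizer.pad_token = tokenizer.eos_token
It only works on a single GPU.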
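A device-side assert during generation is often caused by an index that falls outside a lookup table, for example a token id that is not smaller than the model's embedding size, or a special token that was never set. Whether that is the cause here is not confirmed by the log. The sketch below is one way to sanity-check ids before training. It is plain Python, the helper names `find_out_of_range_ids` and `check_special_tokens` are hypothetical, and it assumes the ids have already been turned into Python ints (e.g. via `tensor.tolist()`).

```python
from typing import Dict, Iterable, List, Optional


def find_out_of_range_ids(token_ids: Iterable[int], vocab_size: int) -> List[int]:
    """Return the ids that an embedding table with `vocab_size` rows cannot index."""
    return [i for i in token_ids if i < 0 or i >= vocab_size]


def check_special_tokens(special_ids: Dict[str, Optional[int]], vocab_size: int) -> List[str]:
    """Return the names of special tokens that are unset or out of range."""
    problems = []
    for name, token_id in special_ids.items():
        if token_id is None or not 0 <= token_id < vocab_size:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only. In a real run they would come
    # from tokenizer.pad_token_id, tokenizer.eos_token_id and model.config.vocab_size.
    vocab_size = 100
    print(find_out_of_range_ids([1, 5, 100, -1], vocab_size))
    print(check_special_tokens({"pad": None, "eos": 2}, vocab_size))
```

If either helper reports a problem for the real tokenizer and model, setting the pad token as in the snippet from the trl issues would be a reasonable thing to try. If both come back empty, the cause is probably elsewhere.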
Using pad_token, but it is not set yet.
Loading base model for ppo training...
Loading base
Loading lora
Loading ppo
WARNING:root:A <class 'peft.peft_model.PeftModelForCausalLM'> model is loaded from '/root/autodl-tmp/LLM/weights/sft_lora', and no v_head weight is found. This IS expected if you are not resuming PPO training.
Loading base model for reward model...
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Some weights of BaichuanForSequenceClassification were not initialized from the model checkpoint at baichuan-inc/baichuan-7B and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Start training
0it [00:00, ?it/s]---------------------
CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
0
0it [00:10, ?it/s]
Traceback (most recent call last):
File "rl_training.py", line 331, in <module>
response_tensors = ppo_trainer.generate(
File "/root/miniconda3/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py", line 446, in generate
return self._generate_batched(
File "/root/miniconda3/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py", line 503, in _generate_batched
generations = self.accelerator.unwrap_model(self.model).generate(**padded_inputs, **generation_kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/trl/models/modeling_value_head.py", line 198, in generate
return self.pretrained_model.generate(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/peft/peft_model.py", line 975, in generate
outputs = self.base_model.generate(**kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
return self.sample(
File "/root/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 2730, in sample
outputs = self(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 166, in new_forward
return module._hf_hook.post_forward(module, output)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 305, in post_forward
output = send_to_device(output, self.input_device, skip_keys=self.skip_keys)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 160, in send_to_device
{
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 161, in <dictcomp>
k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 151, in send_to_device
return honor_type(
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 83, in honor_type
return type(obj)(generator)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 152, in <genexpr>
tensor, (send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys) for t in tensor)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 151, in send_to_device
return honor_type(
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 83, in honor_type
return type(obj)(generator)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 152, in <genexpr>
tensor, (send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys) for t in tensor)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/utils/operations.py", line 167, in send_to_device
return tensor.to(device, non_blocking=non_blocking)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "rl_training.py", line 364, in <module>
print(question_tensors)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor.py", line 426, in __repr__
return torch._tensor_str._str(self, tensor_contents=tensor_contents)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor_str.py", line 636, in _str
return _str_intern(self, tensor_contents=tensor_contents)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor_str.py", line 567, in _str_intern
tensor_str = _tensor_str(self, indent)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor_str.py", line 327, in _tensor_str
formatter = _Formatter(get_summarized_data(self) if summarize else self)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor_str.py", line 111, in __init__
value_str = "{}".format(value)
File "/root/miniconda3/lib/python3.8/site-packages/torch/_tensor.py", line 872, in __format__
return self.item().__format__(format_spec)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
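As the message itself notes, CUDA errors are reported asynchronously, so the frame in `send_to_device` is not necessarily where the assert actually fired. The second traceback is most likely just a follow-on: once a device-side assert has triggered, later CUDA operations in the same process fail too, including `print(question_tensors)`. A minimal sketch of enabling the suggested `CUDA_LAUNCH_BLOCKING=1` from inside the script, assuming it is placed at the very top of `rl_training.py` before anything initializes CUDA:

```python
import os

# Assumption: this runs before torch (or anything that imports torch)
# makes its first CUDA call; set too late, it may have no effect.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # make kernel launches synchronous

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

Prefixing the launch command with `CUDA_LAUNCH_BLOCKING=1`, next to `CUDA_VISIBLE_DEVICES`, should work as well. Training will be slower with this set, so it is only meant for the debugging run.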