Episode [1/1]: 0%| | 0/2500 [00:00<?, ?it/s]/root/miniconda3/envs/openrlhf/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:535: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
rank3: Traceback (most recent call last):
rank3: File "/workspace/OpenRLHF/examples/train_ppo.py", line 347, in
rank3: File "/workspace/OpenRLHF/examples/train_ppo.py", line 240, in train
rank3: File "/root/.local/lib/python3.10/site-packages/openrlhf/trainer/ppo_trainer.py", line 176, in fit
rank3: experience = self.experience_maker.make_experience(rand_prompts, **self.generate_kwargs)
rank3: File "/root/miniconda3/envs/openrlhf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
rank3: return func(*args, **kwargs)
rank3: File "/root/.local/lib/python3.10/site-packages/openrlhf/trainer/ppo_utils/experience_maker.py", line 120, in make_experience
rank3: sequences, attention_mask, action_mask = self.actor.generate(**inputs, **generate_kwargs)
rank3: File "/root/miniconda3/envs/openrlhf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
rank3: return func(*args, **kwargs)
rank3: File "/root/.local/lib/python3.10/site-packages/openrlhf/models/actor.py", line 136, in generate
rank3: return self.process_sequences(sequences, input_ids.size(1), eos_token_id, pad_token_id)
rank3: File "/root/.local/lib/python3.10/site-packages/openrlhf/models/actor.py", line 156, in process_sequences
rank3: mask = (mask <= eos_indices) & (mask >= first_token_indices)
rank3: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cpu!
[2024-06-14 18:40:51,919] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289741
[2024-06-14 18:40:52,442] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289742
[2024-06-14 18:40:52,994] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289743
[2024-06-14 18:40:53,500] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289744
[2024-06-14 18:40:53,501] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289745
[2024-06-14 18:40:54,047] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289746
[2024-06-14 18:40:54,596] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289747
[2024-06-14 18:40:55,183] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 289748
I got this error when I ran train_ppo_llama.sh.
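For reference, here is a minimal sketch of how I understand the failing comparison. I have not confirmed how `mask` is built inside `process_sequences`. My assumption is that it comes from something like `torch.arange(seq_length)` with no `device` argument, which would leave it on the CPU while `eos_indices` and `first_token_indices` sit on `cuda:3`. The helper name `span_mask` is hypothetical and not part of OpenRLHF. It only illustrates creating the position indices on the same device as the input:

```python
import torch


def span_mask(attention_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: True from the first to the last real token of each row."""
    seq_length = attention_mask.size(1)
    # Index of the last non-padding token in each row, shape (batch, 1).
    last_indices = seq_length - 1 - attention_mask.long().fliplr().argmax(dim=1, keepdim=True)
    # Index of the first non-padding token in each row, shape (batch, 1).
    first_indices = attention_mask.long().argmax(dim=1, keepdim=True)
    # Build the position indices on the input's device. Omitting `device=`
    # would create them on the CPU (assumed cause of the error above).
    positions = torch.arange(seq_length, device=attention_mask.device).unsqueeze(0)
    return (positions <= last_indices) & (positions >= first_indices)


if __name__ == "__main__":
    attention_mask = torch.tensor([[0, 1, 1, 1, 0], [1, 1, 0, 0, 0]])
    print(span_mask(attention_mask))
```

If that assumption holds, moving the existing `mask` tensor to `sequences.device` before the comparison should have the same effect.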