Closed: tokarev-i-v closed this issue 1 year ago
Hi @tokarev-i-v,
Thanks for reaching out!
I updated the README, as it was misleading (see PR #15). When GPU(s) are available, Accelerate automatically tries to allocate a different device to each process. In your case, the lamorel launcher starts two processes, yet only one GPU is available. To avoid this, you must launch the two processes by hand, each one being a single-process job from Accelerate's point of view:
python -m lamorel_launcher.launch --config-path absolute/path/to/project/examples/configs --config-name local_gpu_config rl_script_args.path=absolute/path/to/project/examples/example_script.py lamorel_args.accelerate_args.machine_rank=0
python -m lamorel_launcher.launch --config-path absolute/path/to/project/examples/configs --config-name local_gpu_config rl_script_args.path=absolute/path/to/project/examples/example_script.py lamorel_args.accelerate_args.machine_rank=1
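To make the failure mode concrete, here is a minimal, hypothetical sketch (not Accelerate's actual code) of how a round-robin process-to-GPU assignment breaks down when a launcher spawns more processes than there are GPUs, and why splitting the launch into one process per `machine_rank` sidesteps it:

```python
def assign_devices(num_processes: int, num_gpus: int) -> list[str]:
    """Hypothetical sketch of per-process device allocation.

    Each process rank gets its own GPU; with more processes than
    GPUs (e.g. 2 processes, 1 GPU) the mapping cannot be satisfied,
    which is why launching each process separately is needed.
    """
    if num_gpus == 0:
        # No GPUs: every process falls back to CPU.
        return ["cpu"] * num_processes
    if num_processes > num_gpus:
        raise RuntimeError(
            f"{num_processes} processes but only {num_gpus} GPU(s); "
            "launch each process separately (one per machine_rank)."
        )
    # One distinct device per process rank.
    return [f"cuda:{rank}" for rank in range(num_processes)]
```

For example, `assign_devices(2, 2)` yields `["cuda:0", "cuda:1"]`, while `assign_devices(2, 1)` raises, mirroring the error above; launching two single-process jobs means each call is effectively `assign_devices(1, 1)`, which succeeds.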
Hello! I have a problem starting the PPO_finetuning example with only 1 machine and 1 GPU. However, I successfully ran the examples in https://github.com/flowersteam/Grounding_LLMs_with_online_RL, which ships with its own copy of lamorel (lamorel 0.1).
It looks like a problem with the GPU-to-process mapping: