flowersteam / lamorel

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
MIT License

Can't start PPO_finetuning example with 1 machine and 1 GPU #14

Closed by tokarev-i-v 10 months ago

tokarev-i-v commented 12 months ago

Hello! I have a problem starting the PPO_finetuning example with only 1 machine and 1 GPU. However, I successfully ran the examples in https://github.com/flowersteam/Grounding_LLMs_with_online_RL with the lamorel version provided inside (lamorel 0.1).

It looks like a problem with the mapping between GPUs and processes:

RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
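"invalid device ordinal" generally means a process asked for a CUDA device index that does not exist on the machine. Below is a minimal sketch for checking this before launching. It assumes each process is given its own GPU index (rank 0 gets GPU 0, rank 1 gets GPU 1, and so on); the helper names `ranks_without_gpu` and `visible_gpu_count` are hypothetical and not part of lamorel or Accelerate.

```python
from typing import List


def ranks_without_gpu(n_processes: int, n_gpus: int) -> List[int]:
    """Return the process ranks whose GPU index would not exist.

    Assumption (for illustration only): when at least one GPU is visible,
    process `rank` is mapped to GPU index `rank`. With no GPU visible,
    everything is assumed to run on CPU, so no rank is reported.
    """
    if n_gpus == 0:
        return []
    return [rank for rank in range(n_processes) if rank >= n_gpus]


def visible_gpu_count() -> int:
    """Number of CUDA devices PyTorch can see (0 if PyTorch is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    n_processes = 2  # e.g. two processes started by a launcher
    n_gpus = visible_gpu_count()
    bad_ranks = ranks_without_gpu(n_processes, n_gpus)
    if bad_ranks:
        print(f"{n_gpus} GPU(s) visible, but ranks {bad_ranks} have no device.")
    else:
        print(f"{n_gpus} GPU(s) visible, every rank has a device or runs on CPU.")
```

With 2 processes and 1 GPU, `ranks_without_gpu(2, 1)` reports rank 1, which matches the situation described in this issue.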

ClementRomac commented 12 months ago

Hi @tokarev-i-v,

Thanks for reaching out!

I updated the README as it was misleading (see PR #15). When GPUs are available, Accelerate automatically tries to assign a different device to each process. In your case, the lamorel launcher starts two processes, yet only one GPU is available. To avoid this, you must launch the two processes separately by hand (each one ends up being a single process for Accelerate):