eureka-research / Eureka

Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
https://eureka-research.github.io/
MIT License

How to train on multiple GPUs #35

Open DJjiery opened 6 months ago

DJjiery commented 6 months ago

Can you tell me how to train with multiple GPUs? Is it done the same way as in Isaac Gym?

ant2022tna commented 6 months ago

Same problem, do you have a solution?

DJjiery commented 6 months ago

> Same problem, do you have a solution?

In Eureka training, the program automatically finds the freest GPU on the server, so it can use multiple cards on its own; the selection code is at line 187 of eureka.py, and the function it calls lives in misc.py. However, running train.py with rl_device=cuda:X (where X is not card 0) in the isaacgymenvs directory under this project produces an error saying the tensors are not all on the same card. Is there a good way to handle this?
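The freest-GPU selection described above can be sketched like this. This is a minimal illustration, not the actual code in misc.py: the function names `pick_freest_gpu` and `query_freest_gpu` and the exact nvidia-smi query are assumptions based on the comment, and the real implementation may differ.

```python
import subprocess


def pick_freest_gpu(smi_output: str) -> int:
    """Return the index of the GPU with the most free memory.

    `smi_output` is the text produced by:
        nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits
    i.e. one integer (free MiB) per line, one line per GPU.
    """
    free_mib = [int(line.strip()) for line in smi_output.strip().splitlines()]
    return max(range(len(free_mib)), key=lambda i: free_mib[i])


def query_freest_gpu() -> int:
    """Run nvidia-smi and pick the freest GPU (requires an NVIDIA driver)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return pick_freest_gpu(out)


if __name__ == "__main__":
    # Parsing demo with canned output for a 4-GPU server:
    sample = "1024\n23040\n512\n8192\n"
    print(pick_freest_gpu(sample))  # GPU 1 has the most free memory
```

The selected index could then be passed through as `cuda:{index}` for the device flags discussed below.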

ViktorM commented 5 months ago

@DJjiery you need to set not only rl_device=cuda:X but also sim_device=cuda:X to match the rl_device
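A launch command keeping the device flags consistent might look like the following. The task name and GPU index are placeholders; `graphics_device_id` is the additional flag IsaacGymEnvs uses for the renderer, which may also need to match:

```shell
# Pin simulation, RL training, and rendering to the same GPU (index 1 here).
python train.py task=FrankaCabinet \
    sim_device=cuda:1 \
    rl_device=cuda:1 \
    graphics_device_id=1
```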

DJjiery commented 5 months ago

> @DJjiery you need to set not only rl_device=cuda:X but also sim_device=cuda:X to match the rl_device

I tried this, but it still produces the same error. It only happens with the isaacgymenvs copy bundled under Eureka; training with the official IsaacGymEnvs repo works fine.