Denys88 / rl_games

RL implementations
MIT License

Debugging multi-GPU issue #161

Closed vwxyzjn closed 2 years ago

vwxyzjn commented 2 years ago

In IsaacGymEnvs, rl-games + multiGPU seems to have some issues. As shown in the screenshot, rl-games + multiGPU uses twice the amount of data and performs worse than the single-GPU setting in Ant.

image

This issue tracks the investigation.

Proposed debugging route

I suggest we first make sure there is no loss in sample efficiency before scaling to more envs, by matching the implementation details in our CleanRL prototype: https://cleanrl-git-new-multi-gpu-vwxyzjn.vercel.app/rl-algorithms/ppo/#implementation-details_6.

Identified issues:

1. Seeding logic and configuration issue

We need to seed multiGPU processes with different seeds to decorrelate experience; otherwise the multiGPU processes will produce exactly the same observations.

Configuration-wise we can set the overall seed with params.seed and env seed with params.config.env_config.seed, so if params.config.env_config.seed is set but params.seed is not set, we get identical observations from the environments as shown below:

image

This is probably ok since the agent still samples different actions, but it's nonetheless a problem. The correct implementation is to use seed = seed + local_rank.
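To illustrate the `seed = seed + local_rank` idea, here is a minimal sketch using only the Python standard library. It is not the rl_games code: `rank_seed` and `sample_observations` are hypothetical helpers, the `LOCAL_RANK` environment variable name is an assumption (launchers differ), and `random.Random` stands in for the environment's RNG.

```python
import os
import random


def rank_seed(base_seed: int, local_rank: int) -> int:
    """Offset the shared base seed by the process rank (seed = seed + local_rank)."""
    return base_seed + local_rank


def sample_observations(seed: int, n: int = 3) -> list:
    """Hypothetical stand-in for an env reset: draws n pseudo-random 'observations'."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# Assumed variable name; a real launcher may expose the rank differently.
local_rank = int(os.environ.get("LOCAL_RANK", "0"))
base_seed = 42
this_process_seed = rank_seed(base_seed, local_rank)

# Same seed on every rank: two simulated workers see identical observations.
same = [sample_observations(base_seed) for _ in range(2)]

# Rank-offset seeds: each simulated worker gets a different stream.
offset = [sample_observations(rank_seed(base_seed, r)) for r in range(2)]

print("identical with shared seed:", same[0] == same[1])
print("identical with rank offset:", offset[0] == offset[1])
```

In a real run the per-process seed would presumably be passed to both the overall seed and the env seed, so that the two settings cannot disagree.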

2. Stepping logic issue

After fixing #163, I was able to match the sample efficiency in the single GPU setting:

image

However, the wall time is slower than I had expected. In a separate benchmark I made with CleanRL, the experiments show that Horovod should make Ant step 20% faster.

Maybe it's the overhead of averaging stats? In the CleanRL benchmark experiments I did not touch the stats at all.

image

Denys88 commented 2 years ago

@vwxyzjn can we close it?

vwxyzjn commented 2 years ago

Closed by #171

1tac11 commented 1 year ago

Hi there, is multi-instance multi-GPU working?