vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev
Other
5.02k stars, 575 forks

Deprecate `ppo_procgen.py` in favor of EnvPool #340

vwxyzjn closed 8 months ago

vwxyzjn commented 1 year ago

Problem Description

Given the EnvPool==0.8.0 release by @YukunJ, @LeoGuo98, @Trinkle23897 (https://github.com/sail-sg/envpool/pull/197), we can go ahead and deprecate `ppo_procgen.py` in favor of #338, which should also work with Procgen while giving us the benefits of JAX, EnvPool's async API, and a more concise codebase.

vwxyzjn commented 1 year ago

The preliminary proof-of-concept is really encouraging. A few lines of changes in #338 yield a ~7x improvement in overall training speed. Note that this proof-of-concept does not include reward normalization, which could be a bottleneck in `ppo_procgen.py`, so the comparison is not yet apples-to-apples. Further investigation is warranted.
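For readers unfamiliar with the reward normalization mentioned here, this is a minimal pure-Python sketch of the common running-return scheme: rewards are divided by the running standard deviation of the discounted return. It is an illustration under assumptions, not the code from `ppo_procgen.py` or #338; the names `RunningMeanVar` and `RewardNormalizer` and the default `gamma` and `epsilon` values are hypothetical.

```python
import math


class RunningMeanVar:
    """Running mean/variance of a scalar stream, updated one batch at a time."""

    def __init__(self, epsilon=1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # small prior count avoids division by zero

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        # Combine the old and new second moments (parallel variance formula).
        m2 = (self.var * self.count + batch_var * n
              + delta ** 2 * self.count * n / total)
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total


class RewardNormalizer:
    """Scales rewards by the running std of the per-env discounted return."""

    def __init__(self, num_envs, gamma=0.99, epsilon=1e-8):
        self.returns = [0.0] * num_envs
        self.stats = RunningMeanVar()
        self.gamma = gamma
        self.epsilon = epsilon

    def __call__(self, rewards, dones):
        # Accumulate the discounted return of every environment.
        self.returns = [self.gamma * g + r
                        for g, r in zip(self.returns, rewards)]
        self.stats.update(self.returns)
        scale = math.sqrt(self.stats.var + self.epsilon)
        normalized = [r / scale for r in rewards]
        # Episodes that just ended start again from a zero return.
        self.returns = [0.0 if d else g
                        for g, d in zip(self.returns, dones)]
        return normalized


if __name__ == "__main__":
    normalizer = RewardNormalizer(num_envs=2)
    print(normalizer([1.0, 0.0], [False, True]))
```

Because an update like this runs in Python on every environment step, it is one plausible place for the overhead suspected above, though that would need profiling to confirm.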

vwxyzjn commented 8 months ago

Folks can just use https://github.com/vwxyzjn/cleanba