vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

JAX Integration with CleanRL #218

Closed (vwxyzjn closed 1 year ago)

vwxyzjn commented 2 years ago

Problem Description

Given the incredible performance of the DDPG + JAX prototype (https://github.com/vwxyzjn/cleanrl/pull/187), it's worth prototyping JAX with other algorithms as well! This issue tracks the overall progress of integrating JAX with CleanRL.

Useful resources

Common gotchas and errors:

Useful pattern when extending

In CleanRL, a file diff is incredibly helpful. For example, to learn how TD3 differs from DDPG, I could diff the two single-file implementations:

[image: file diff of the DDPG and TD3 implementations]
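The file-diff pattern described above can also be reproduced outside an editor. Below is a minimal sketch using Python's standard `difflib` module. The helper name `file_diff`, the file names `ddpg_sketch.py` and `td3_sketch.py`, and the two toy snippets are hypothetical stand-ins for illustration, not the actual CleanRL sources.

```python
import difflib


def file_diff(old_text, new_text, old_name="old.py", new_name="new.py"):
    """Return a unified diff between two source texts as a single string."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff_lines)


# Toy stand-ins for two single-file implementations (hypothetical content,
# not copied from CleanRL).
ddpg_like = "qf1 = QNetwork(envs)\nactor = Actor(envs)\n"
td3_like = "qf1 = QNetwork(envs)\nqf2 = QNetwork(envs)\nactor = Actor(envs)\n"

diff_text = file_diff(ddpg_like, td3_like, "ddpg_sketch.py", "td3_sketch.py")
print(diff_text)
```

To compare real files, the two strings could be read with `pathlib.Path(path).read_text()` before calling `file_diff`. Lines prefixed with `+` appear only in the second file, which makes the additions of one algorithm over the other easy to spot.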

Contribution process

There is a contribution checklist to help streamline the contribution process. Each new contribution needs documentation, tests, benchmark experiments, etc. See https://github.com/vwxyzjn/cleanrl/pull/186 as an example.

Tracked issues