Problem Description
Given the incredible performance of the DDPG + JAX prototype (#187), it's worth prototyping TD3 + JAX as well. @joaogui1 is super experienced with JAX and has expressed interest in working on this. Thanks @joaogui1 for your interest! This issue tracks the development effort.
I suggest extending the DDPG prototype to work with TD3. Here are a couple of additional resources:
To see exactly how CleanRL's DDPG differs from TD3, a file diff between ddpg_continuous_action.py and td3_continuous_action.py would explicitly show the code differences.
There is a contribution checklist to help when making the PR. See https://github.com/vwxyzjn/cleanrl/pull/186 as an example.
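
In case it helps with the DDPG vs. TD3 comparison mentioned above, here is a minimal sketch of producing that file diff locally with Python's standard `difflib`. The helper name `unified_file_diff` and the `cleanrl/` paths are assumptions for illustration only; adjust them to wherever the two scripts live in your checkout.

```python
import difflib
from pathlib import Path


def unified_file_diff(old_text: str, new_text: str, old_name: str, new_name: str) -> str:
    """Return a unified diff between two source files given as strings."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    # unified_diff yields nothing when the inputs are identical,
    # so the result is an empty string in that case.
    return "".join(diff_lines)


if __name__ == "__main__":
    # Assumed locations of the two scripts; not confirmed by this issue.
    ddpg_path = Path("cleanrl/ddpg_continuous_action.py")
    td3_path = Path("cleanrl/td3_continuous_action.py")
    if ddpg_path.exists() and td3_path.exists():
        print(
            unified_file_diff(
                ddpg_path.read_text(),
                td3_path.read_text(),
                ddpg_path.name,
                td3_path.name,
            )
        )
    else:
        print("Scripts not found; edit the assumed paths above.")
```

Lines prefixed with `+` in the output exist only in the TD3 script and lines prefixed with `-` exist only in the DDPG script, which should give a rough map of what the JAX port needs to add on top of the DDPG prototype.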
Thanks again @joaogui1 and let me know if you run into any issues!