vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

DDPG documentation tweaks; added Q loss equations and light explanation #145

Closed dosssman closed 2 years ago

dosssman commented 2 years ago

Description

Other comments

Other than that, great job on the pretty complete documentation for DDPG, @vwxyzjn @yooceii, and sorry for being late to the party :bow:

Types of changes

Checklist:

If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See https://github.com/vwxyzjn/cleanrl/pull/137 as an example PR.

vercel[bot] commented 2 years ago

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/vwxyzjn/cleanrl/CSE1uakxpjPwtLa1Dm9cmjwxxE4g
✅ Preview: https://cleanrl-git-fork-dosssman-ddpg-docs-tweaks-vwxyzjn.vercel.app

vwxyzjn commented 2 years ago

This PR is a follow-up on #137. Thanks @dosssman for this fix! I will take a look at it tomorrow :)

Regarding the hard time reproducing DDPG on Mujoco-v1, I was wondering how feasible it would be to run Fujimoto's DDPG.py etc. on free-mujoco.

There it is: https://wandb.ai/openrlbenchmark/openrlbenchmark/reports/MuJoCo-sfujim-TD3--VmlldzoxNzIyODIz

dosssman commented 2 years ago

Thanks. The report seems to be private, though:

image

vwxyzjn commented 2 years ago

Could you try it again?

dosssman commented 2 years ago

All good now