vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

Question about the `noise-clip` parameter in DDPG. #419

Closed helpingstar closed 10 months ago

helpingstar commented 1 year ago

https://github.com/vwxyzjn/cleanrl/blob/7e24ae238eab6a8e7efbbf452cb4a8922bcda73f/cleanrl/ddpg_continuous_action.py#L65-L66

The `noise-clip` argument defined at the lines above doesn't appear to be used anywhere in the code, and the documentation doesn't mention it.

How should I use this parameter?

https://github.com/vwxyzjn/cleanrl/blob/7e24ae238eab6a8e7efbbf452cb4a8922bcda73f/cleanrl/ddpg_continuous_action.py#L196

Can I just add torch.clip (or torch.clamp) to the line linked above, as shown below?

actions += torch.clip(torch.normal(0, actor.action_scale * args.exploration_noise), -args.noise_clip, args.noise_clip)

glass1720 commented 10 months ago

This is a parameter introduced in TD3, an extension of DDPG (see http://proceedings.mlr.press/v80/fujimoto18a/fujimoto18a.pdf). Reading the CleanRL docs, it's clear the DDPG implementation was heavily inspired by the TD3 authors' implementation of TD3 and DDPG. My guess is that this parameter was accidentally added to the parser because it's required for TD3, and that it can safely be removed from this DDPG implementation.
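To make the distinction concrete, here is a minimal sketch of where the clip sits in each algorithm. It is not the CleanRL code: it uses plain Python floats instead of torch tensors to stay dependency-free, and the function names, the `rng` argument, and the scalar action bounds are all assumptions made for illustration. In TD3 the clipped noise is added to the target actor's action when computing the critic target (target policy smoothing), whereas DDPG only adds unclipped exploration noise to the action sent to the environment.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def ddpg_exploration_action(policy_action, exploration_noise, action_scale, low, high, rng):
    """DDPG-style exploration (hypothetical helper).

    Gaussian noise is added to the action taken in the environment.
    The noise itself is not clipped; only the final action is kept
    inside the action bounds.
    """
    noise = rng.gauss(0.0, action_scale * exploration_noise)
    return clip(policy_action + noise, low, high)


def td3_smoothed_target_action(target_action, policy_noise, noise_clip, action_scale, low, high, rng):
    """TD3-style target policy smoothing (hypothetical helper).

    The noise is clipped to [-noise_clip, noise_clip] before it is added
    to the target actor's action. This action feeds the critic target,
    not the environment.
    """
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip) * action_scale
    return clip(target_action + noise, low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    print(ddpg_exploration_action(0.0, 0.1, 1.0, -1.0, 1.0, rng))
    print(td3_smoothed_target_action(0.0, 0.2, 0.5, 1.0, -1.0, 1.0, rng))
```

So clipping the exploration noise in DDPG, as proposed in the question, would run without error, but it is not what `noise_clip` was designed for in the TD3 paper.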

helpingstar commented 10 months ago

@glass1720 Thank you