openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
MIT License

Failing to reproduce DDPG results on Plappert et al., 2018 #1080

Open guillefix opened 4 years ago

guillefix commented 4 years ago

I am trying to reproduce the DDPG results from Plappert et al., 2018 (arxiv.org/abs/1802.09464).

However, even the simplest environment, FetchReach-v1, seems to stay stuck at a return of around -49. I have tried many different settings; the latest is simply a variation of the HER instructions at https://github.com/openai/baselines/tree/master/baselines/her, changed to use plain DDPG:

mpiexec -np 12 python3 -m baselines.run --alg=ddpg --env=FetchReach-v1 --num_timesteps=50000

It seems to stay stuck at

| rollout/return                 | -49.9    |
| rollout/return_history         | -49.9    |
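
For context, here is a minimal sketch of why a return this close to -50 suggests the goal is almost never reached. It assumes FetchReach-v1 uses the sparse reward from Plappert et al. (-1 on every step where the gripper is farther than a threshold from the goal, 0 otherwise) and a 50-step episode limit. The constants, function names, and trajectories below are illustrative assumptions, not code from baselines or gym.

```python
import math

EPISODE_LENGTH = 50        # assumed time limit for FetchReach-v1
DISTANCE_THRESHOLD = 0.05  # assumed success radius around the goal


def sparse_reward(achieved_goal, desired_goal, threshold=DISTANCE_THRESHOLD):
    """Return 0.0 when within `threshold` of the goal, otherwise -1.0."""
    return -1.0 if math.dist(achieved_goal, desired_goal) > threshold else 0.0


def episode_return(achieved_goals, desired_goal):
    """Sum the sparse reward over every step of one episode."""
    return sum(sparse_reward(a, desired_goal) for a in achieved_goals)


goal = (1.3, 0.7, 0.5)   # hypothetical goal position
far = (1.5, 0.7, 0.5)    # hypothetical gripper position, 0.2 away from the goal

# Hypothetical trajectory that never gets near the goal.
never_reaches = [far] * EPISODE_LENGTH
# Hypothetical trajectory that reaches the goal after 10 steps and stays there.
reaches_after_10 = [far] * 10 + [goal] * (EPISODE_LENGTH - 10)

print(episode_return(never_reaches, goal))     # -50.0
print(episode_return(reaches_after_10, goal))  # -10.0
```

Under these assumptions, a logged average of -49.9 would mean the policy is outside the success radius on nearly every step of nearly every rollout.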

DanielTakeshi commented 4 years ago

I ran into several issues with DDPG as well and reported them in earlier issues here.

Note that baselines is no longer maintained. I recommend stable-baselines or other code bases.