openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
MIT License

HER train success rate is lower than test success rate #468

Closed xuhuazhe closed 6 years ago

xuhuazhe commented 6 years ago

May I ask why the training success rate for FetchPickAndPlace-v1 is only 0.5 while the test success rate is very high?

hi-abhi commented 6 years ago

During training, the agent deliberately does not act greedily. With probability random_eps it takes a random action instead of the policy's action, and it also adds noise, scaled by noise_eps, to the action.

These perturbations help the agent explore, but they generally keep the training success rate from ever reaching 100 percent. That is why the code includes a separate evaluation rollout worker in which both values are set to 0: without random actions or noise penalizing it, the agent can achieve a much higher success rate.
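
To make the difference concrete, here is a rough sketch of the idea. It is not the actual baselines code: the function name `select_action`, the `max_u` action bound, and the example values are assumptions for illustration, and I am assuming Gaussian noise and a uniform random action.

```python
import random


def select_action(policy_action, random_eps, noise_eps, max_u=1.0, rng=random):
    """Perturb a deterministic policy action for exploration (illustrative only)."""
    # Add Gaussian noise scaled by noise_eps, then clip to [-max_u, max_u].
    noisy = [
        min(max_u, max(-max_u, a + noise_eps * max_u * rng.gauss(0.0, 1.0)))
        for a in policy_action
    ]
    # With probability random_eps, discard the policy's action entirely
    # and sample a uniform random action instead.
    if rng.random() < random_eps:
        return [rng.uniform(-max_u, max_u) for _ in policy_action]
    return noisy


policy_action = [0.5, -0.25]  # hypothetical output of the learned policy

# Training-style rollout: exploration on (example values, not necessarily the defaults).
train_action = select_action(policy_action, random_eps=0.3, noise_eps=0.2)

# Evaluation-style rollout: both set to 0, so the policy's action passes through unchanged.
eval_action = select_action(policy_action, random_eps=0.0, noise_eps=0.0)
```

With both values at 0 the function returns the policy's action as-is, which is why the evaluation worker reflects what the policy has actually learned, while the training worker's success rate also absorbs the cost of exploration.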

xuhuazhe commented 6 years ago

Thanks a lot!