It's all slightly confusing but I think they're using a different version of the environment when they use demonstrations. If you compare to the HER results of Plappert et al. (https://arxiv.org/pdf/1802.09464.pdf) then the results here are consistent. I have a suspicion the environment they use in what you linked to above is FetchPickAndPlace where all of the goals are above the table (in FetchPickAndPlace-v1 half of the goals are on the table surface, which I expect makes it easier to start learning).
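A minimal sketch (not from this thread) of how one could check that claim empirically: reset FetchPickAndPlace-v1 many times and count how many sampled goals lie above the table surface. The table-top height used below is an assumed approximate value, not taken from the thread or verified against the environment source; adjust it if needed.

```python
import gym

# Assumptions: FetchPickAndPlace-v1 is available (gym with mujoco-py / robotics
# installed), and the table surface sits at roughly z = 0.43 m.
TABLE_TOP_Z = 0.43
N_RESETS = 1000

env = gym.make('FetchPickAndPlace-v1')

in_air = 0
for _ in range(N_RESETS):
    obs = env.reset()
    goal_z = obs['desired_goal'][2]      # goals are (x, y, z) positions
    if goal_z > TABLE_TOP_Z + 0.01:      # small margin above the surface
        in_air += 1

print(f'{in_air}/{N_RESETS} goals sampled in the air '
      f'({100 * in_air / N_RESETS:.1f}%)')
env.close()
```

If roughly half of the goals land on the table surface, that would be consistent with the explanation above for why learning can start more easily in the -v1 environment.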
Yes, you are right. Thanks for the reminder. The paper also mentions it at the end of page 5.
Hi, I find that it is pretty fast to train the agent in the FetchPickAndPlace-v1 environment without using demonstrations. As we all know, this is a two-stage task with a very sparse reward, which makes it the most difficult task among the Fetch robot environments. But with your code, training the agent in FetchPickAndPlace-v1 does not take much longer than training in the other environments (Reach, Slide, and Push).
I see you are mainly using code from OpenAI Baselines, but their results show that training the agent in the FetchPickAndPlace-v1 environment takes a very long time without demonstrations. How do you explain this? Thank you!