avisingh599 / reward-learning-rl

[RSS 2019] End-to-End Robotic Reinforcement Learning without Reward Engineering
https://sites.google.com/view/reward-learning-rl/

Where is the Sawyer environment defined? #4

Closed huiwenzhang closed 5 years ago

huiwenzhang commented 5 years ago

Hi, you have used a customized environment in your simulation tasks. I quickly browsed the code and found nothing related to it. Did you open-source this part of the code? By the way, the parallel implementation breaks up the flow and coherence of the code, which makes it harder to reproduce the experiments and understand the algorithm. Personally, I would prefer a simpler version that can be run on a local PC.

avisingh599 commented 5 years ago

Hi,

Yes, the source code for the environment has also been open-sourced: https://github.com/avisingh599/multiworld/

The requirements.txt points to the version of multiworld that was used for the results in the paper: https://github.com/avisingh599/reward-learning-rl/blob/93bb52f75bea850bd01f3c3342539f0231a561f3/requirements.txt#L51
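
For anyone unfamiliar with that pinning style, a git-based requirement resolves like any pip VCS URL. A minimal sketch (the `<commit>` placeholder is illustrative; the real hash is on the linked line of requirements.txt):

```bash
# Install the pinned multiworld fork directly; <commit> is a placeholder --
# the actual hash is on the linked line of requirements.txt.
pip install "git+https://github.com/avisingh599/multiworld.git@<commit>#egg=multiworld"
```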

avisingh599 commented 5 years ago

With respect to the parallel implementation: we prefer having a unified interface for launching experiments both locally and in the cloud. For example, you can run our code on a local PC using the run_example_local command, and this only needs to be changed to run_example_gce to run on Google Compute Engine (assuming everything else is set up for GCE).
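
For concreteness, here is a hedged sketch of the two launch modes, assuming the softlearning-style entry point (the example module and flag are illustrative; see the README for the exact invocation):

```bash
# Local run on your own machine (module path and flag are illustrative):
softlearning run_example_local examples.classifier_rl --algorithm VICERAQ

# The same experiment on Google Compute Engine; only the subcommand changes
# (assumes GCE credentials and project setup are already in place):
softlearning run_example_gce examples.classifier_rl --algorithm VICERAQ
```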

huiwenzhang commented 5 years ago

Got it, thanks for pointing that out. I will dig in deeper to see what's happening inside. Besides, I want to know: if you train a task (such as the visual picker task, which seems to need more interaction steps based on your paper) on a single local PC with one GPU, how long does it take to reach satisfactory performance?

avisingh599 commented 5 years ago

Yes, the visual picking task does take more time since it requires a fair bit of exploration, but you should be able to run it with one random seed on a PC with one GPU in about 16 hours. If you have a powerful CPU/GPU combination, you might even be able to run 2-3 seeds in parallel without impacting the total running time much.

Also, the default requirements.txt installs the CPU-only TensorFlow, so be sure to install tensorflow-gpu==1.13.1 to take advantage of the GPU when you run these experiments.
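
If it helps, the swap is just an uninstall/reinstall in the same environment (the version below is assumed from the requirements pin; adjust if yours differs):

```bash
# Replace the CPU-only build with the GPU build inside the project's env.
pip uninstall -y tensorflow
pip install tensorflow-gpu==1.13.1
```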

avisingh599 commented 5 years ago

Also, closing this issue for now, but feel free to reopen if you have more questions about the environment definition, or create new issues if you have other questions. Thanks for your interest in the codebase!