Closed huiwenzhang closed 5 years ago
Hi,
Yes, the source code for the environment has also been open-sourced: https://github.com/avisingh599/multiworld/
The requirements.txt points to the version of multiworld that was used for the results in the paper: https://github.com/avisingh599/reward-learning-rl/blob/93bb52f75bea850bd01f3c3342539f0231a561f3/requirements.txt#L51
With respect to the parallel implementation, we prefer having a unified interface for launching experiments locally and on the cloud. For example, we can run our code on a local PC by using the run_example_local command, and this only needs to be changed to run_example_gce to run on Google Compute Engine (assuming you have everything else set up for GCE).
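The local/cloud switch described above can be sketched roughly as follows. This is only an illustration: the subcommand names run_example_local and run_example_gce come from the reply above, while the `softlearning` entry-point name, the `examples.classifier_rl` module, and the `build_launch_command` helper are assumptions that are not stated in this thread.

```python
from typing import List, Sequence

# Subcommand names are taken from the reply above; everything else is hypothetical.
SUBCOMMANDS = {"local": "run_example_local", "gce": "run_example_gce"}


def build_launch_command(target: str,
                         example_module: str,
                         extra_args: Sequence[str] = (),
                         cli: str = "softlearning") -> List[str]:
    """Return the argv list for launching one experiment on the given target."""
    if target not in SUBCOMMANDS:
        raise ValueError(
            f"unknown target {target!r}; expected one of {sorted(SUBCOMMANDS)}")
    return [cli, SUBCOMMANDS[target], example_module, *extra_args]


# Hypothetical example module; substitute whatever the repository documents.
local_cmd = build_launch_command("local", "examples.classifier_rl")
gce_cmd = build_launch_command("gce", "examples.classifier_rl")

# The two commands differ only in the subcommand.
print(" ".join(local_cmd))
print(" ".join(gce_cmd))
```

The point of the sketch is that the experiment definition and its arguments stay identical; only the subcommand selects where the run happens.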
Got it. Thanks for pointing that out. I will dig deeper to see what's happening inside. Also, if I train one of your tasks (such as the visual picker task, which seems to need more interaction steps according to your paper) on a single local PC with one GPU, how long does it take to reach satisfactory performance?
Yes, the visual picking task does take more time since it requires a fair bit of exploration, but you should be able to run it with one random seed on a PC with one GPU in about 16 hours. If you have a powerful CPU/GPU combination, you might even be able to run 2-3 seeds in parallel without impacting the total running time much.
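One way to run a few seeds side by side on a single machine is to start one process per seed and wait for all of them. The sketch below assumes each seed runs as an independent process; the `run_seeds_in_parallel` helper and the placeholder command are hypothetical and stand in for whatever training command you actually use.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def run_seeds_in_parallel(make_command: Callable[[int], List[str]],
                          seeds: Iterable[int]) -> List[int]:
    """Start one process per seed, then wait for all; return their exit codes."""
    # Launch every process first so that they run concurrently.
    procs = [subprocess.Popen(make_command(seed)) for seed in seeds]
    # Block until each process finishes and collect its return code.
    return [proc.wait() for proc in procs]


def placeholder_command(seed: int) -> List[str]:
    """Hypothetical stand-in for a real training command that takes a seed."""
    return [sys.executable, "-c", f"print('training with seed {seed}')"]


if __name__ == "__main__":
    exit_codes = run_seeds_in_parallel(placeholder_command, [0, 1, 2])
    print(exit_codes)
```

Whether 2-3 concurrent runs fit without slowing each other down depends on GPU memory and CPU cores, so it is worth watching utilization on the first attempt.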
Also, the default requirements.txt installs the non-GPU tensorflow, so be sure to install tensorflow-gpu==0.1.13 to take advantage of the GPU if you run these experiments.
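After switching packages, a quick check like the one below can confirm whether TensorFlow actually sees a GPU. This is a minimal sketch: the `tensorflow_gpu_status` helper is hypothetical, and it relies on `tf.test.is_gpu_available()`, which exists in TensorFlow 1.x (and is deprecated in 2.x).

```python
def tensorflow_gpu_status() -> str:
    """Report whether TensorFlow is installed and whether it can see a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow not installed"
    # True only if a GPU build is installed and a usable GPU device is found.
    return "gpu visible" if tf.test.is_gpu_available() else "cpu only"


print(tensorflow_gpu_status())
```

If this reports "cpu only" on a machine with a GPU, the non-GPU package is probably still the one being imported, or the CUDA libraries are not being found.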
Also, closing this issue for now, but feel free to reopen if you have more questions about the environment definition, or create new issues if you have other questions. Thanks for your interest in the codebase!
Hi, you used a customized environment in your simulation task. I quickly browsed the code and found nothing related to it. Have you open-sourced this part of the code? By the way, the parallel implementation breaks up the structure and coherence of the code, which makes it harder to reproduce the experiments and understand the algorithm. Personally, I would prefer a simpler version that can be run on a local PC.