avisingh599 / reward-learning-rl

[RSS 2019] End-to-End Robotic Reinforcement Learning without Reward Engineering
https://sites.google.com/view/reward-learning-rl/

Error in saving the videos #11

Open weijiafeng opened 5 years ago

weijiafeng commented 5 years ago

Hi @avisingh599 @hartikainen, when I run "softlearning run_example_local examples.classifier_rl --n_goal_examples 10 --task=Image48SawyerDoorPullHookEnv-v0 --algorithm VICERAQ --num-samples 5 --n_epochs 300 --active_query_frequency 10 --video-save-frequency=1", I get the following error while saving the videos. (I am using the v0.1 tag of the repository, @avisingh599 - could this be the issue?) Thanks!

[screenshot: error traceback from video saving]
weijiafeng commented 5 years ago

Hi @avisingh599 @hartikainen, after a few more attempts I realized that the edits Avi made to save the videos are on the master branch, not in v0.1, but I have to run with v0.1 because of the dm_control issue described in my previous issue post (https://github.com/avisingh599/reward-learning-rl/issues/8). I tried overwriting the softlearning/samplers/utils.py file in v0.1, but that leads to further errors:

I think there are substantial updates to the softlearning package that were merged into the master branch but are not reflected in v0.1. Would it be possible to merge the softlearning and video-saving changes into v0.1 so I could run with it? Or could you let me know what differs between v0.1 and the latest master that makes the dm_control issue go away in v0.1? Thanks a lot for your help!

[screenshot: further errors after overwriting softlearning/samplers/utils.py]
avisingh599 commented 5 years ago

Hi @weijiafeng, v0.1 is actually just an older version of the master branch. The dm_control errors don't show up on my machine, which is why it has been hard for me to debug them on master. To get video logging working with the older version (i.e. v0.1), I would suggest taking the <10-line diff from this commit and applying it to v0.1. Let me know how it goes!
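For reference, the commit's change is roughly of this shape (a minimal sketch, not the verbatim diff: the actual function names in softlearning/samplers/utils.py and the policy interface in v0.1 may differ):

```python
import os
import imageio
import numpy as np

def rollout_with_video(env, policy, path_length, video_save_path=None):
    """Run one episode; optionally record rendered frames and write them to a video file."""
    frames = []
    observation = env.reset()
    for _ in range(path_length):
        # `policy` is assumed here to be a plain callable mapping observation -> action;
        # adapt this to whatever interface the sampler in v0.1 actually uses.
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        if video_save_path is not None:
            frames.append(env.render(mode='rgb_array'))
        if done:
            break
    if video_save_path is not None and frames:
        os.makedirs(os.path.dirname(video_save_path) or '.', exist_ok=True)
        imageio.mimwrite(video_save_path, np.stack(frames), fps=30)
```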

weijiafeng commented 5 years ago

@avisingh599 ok, I will try it later today. Thanks~

By the way, it takes around 50 min to get results for the first epoch on a Mac with an i5 CPU and 8 GB of RAM - is this normal? I also see the terminal print "Creating offscreen glfw" two or three times; is it normal for this step to take a long time every time the code runs? Because of the long running time, debugging has been very slow for me :-(

In addition, to help me plan my experiments efficiently:

  1. How long does it take on your end to run "--n_goal_examples 10 --task=Image48SawyerDoorPullHookEnv-v0 --algorithm VICERAQ --num-samples 5 --n_epochs 300 --active_query_frequency 10"?
  2. What GPU allocation do you use to achieve this? (Or does your lab use any computational resources other than GPUs to speed up deep RL training?) I am currently trying to get it running on my lab's GPU cluster with Titan Vs, and need to work around the MuJoCo license issue (I'm currently on trial and student licenses).

Cheers~

avisingh599 commented 5 years ago

Without a GPU, the first epoch can indeed take 50 min. I would suggest using a GPU even for debugging. If that is not possible, you can reduce the epoch_length variable to something like 100 or 200. Also, run with --num-samples=1 if you only want a single random seed.
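Concretely, the kind of change meant here is just shrinking the per-epoch sample budget in the example's variant spec; something along these lines (illustrative only - the exact file, e.g. examples/classifier_rl/variants.py, and key names may differ):

```python
# Debug-sized configuration sketch; the real defaults live in the example's variant spec
# and may use different key names than the ones shown here.
DEBUG_ALGORITHM_PARAMS = {
    'kwargs': {
        'epoch_length': 200,  # default is typically 1000; fewer samples per epoch -> faster epochs
        'n_epochs': 5,        # a handful of epochs is enough to exercise logging / video saving
    },
}
```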

  1. I think I am able to run that experiment in about 5-6 hours.
  2. I use 3-4 jobs per GPU. So, if you want to run five random seeds, you would need two GPUs for about 5-6 hours for the door task. Other tasks might take longer (but need the same GPU allocation). I usually run on V100, P100, or Titan X GPUs. And yes, you would likely need an institutional license to run MuJoCo on a non-personal machine.
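(For packing 3-4 jobs onto one GPU, the usual trick with the TensorFlow 1.x API that softlearning used around this time is to cap each process's GPU memory. Where exactly the session is constructed depends on the repo, so treat this as a generic sketch rather than a repo-specific patch.)

```python
import tensorflow as tf

# Cap each process at ~25% of the GPU so roughly four trials can share one card.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.25)
config = tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True)
session = tf.Session(config=config)
```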
weijiafeng commented 5 years ago

@avisingh599 Thanks for sharing, Avi!

I'm also wondering: if I want to create a new task, e.g. switching on a laptop, similar to "task=Image48SawyerDoorPullHookEnv", how do I do it? I reckon I need to register the env in Gym and collect some success images for the task? Regarding collecting images for training off-policy VICE, do I need to take images from different angles and lighting conditions?

Hopefully I don't need a Sawyer arm to create this new task?

Cheers :-)

avisingh599 commented 5 years ago

No, you don't need a new Sawyer arm. You can fork my version of https://github.com/avisingh599/multiworld and define tasks similar to how I have defined them. Look at my commits in that repo to see how it is done. Also, you don't need images from different angles, and you don't need to worry about variable lighting in sim.
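The registration side of it is just the standard Gym pattern that the multiworld envs already follow; a hypothetical laptop-lid task might be hooked up roughly like this (the id, module path, and class below are made up for illustration, and the Image48... variant would then be produced the same way the existing Sawyer image envs are):

```python
from gym.envs.registration import register

# Hypothetical new task, mirroring how the existing Sawyer envs are registered in multiworld.
register(
    id='LaptopLidOpenEnv-v0',  # the name later passed via --task (or its image-wrapped variant)
    entry_point='multiworld.envs.mujoco.laptop_lid:LaptopLidOpenEnv',  # made-up module:class
)
```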

weijiafeng commented 5 years ago

@avisingh599 Sounds good~

What about running this on real robots? Which robotic arms should I purchase besides the Sawyer - ones that support Python/C++ APIs? Do they need to support MoveIt! and ROS as well? And the DOF of the robotic arm is not used in this paper, right?

Cheers :-)

avisingh599 commented 5 years ago

I only really have experience working with the Sawyer, for which we use just the Intera SDK with ROS. You can use this software package for working with the Sawyer: https://github.com/mdalal2020/sawyer_control

In general, there is a host of other options when it comes to robot arms: the Kuka arms, UR5/UR10, Jaco, Fetch, and so on. And yes, we only control the end-effector position of the robot, and don't control the joint angles directly.
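Concretely, "controlling the end-effector position" means the policy's action is a small Cartesian delta for the gripper, and a robot-side controller (IK/impedance) turns that into joint commands. A minimal sketch of such an action space (names and limits are illustrative, not the paper's exact interface):

```python
import numpy as np
from gym import spaces

# 3-D end-effector delta (dx, dy, dz) in metres, limited to a small step per control cycle.
ee_action_space = spaces.Box(low=-0.05, high=0.05, shape=(3,), dtype=np.float32)

def next_ee_target(current_ee_pos, action):
    """Shift the commanded end-effector target by the (clipped) Cartesian delta."""
    delta = np.clip(action, ee_action_space.low, ee_action_space.high)
    return current_ee_pos + delta
```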

weijiafeng commented 5 years ago

Cool~ Thanks Avi!