cbfinn / gps

Guided Policy Search
http://rll.berkeley.edu/gps/

Move to the OpenAI mujoco bindings for better visualization #65

avisingh599 opened this issue 7 years ago

avisingh599 commented 7 years ago

NEW

With the code in this repo, we can use the OpenAI Python bindings for MuJoCo, which use the native MuJoCo visualizer. It is much better than the current OSG visualizer and supports textures, lighting variations, shadows, etc.
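For reference, here is a minimal sketch of on-screen and offscreen rendering with the mujoco-py bindings. It assumes the mujoco-py 1.50-style API and a hypothetical model path; the branch may target a different version of the bindings, so treat the exact calls as illustrative.

# Minimal sketch, assuming the mujoco-py 1.50-style API; the model path is
# hypothetical and the branch may use a different version of the bindings.
import mujoco_py

model = mujoco_py.load_model_from_path('mjc_models/reacher.xml')  # hypothetical path
sim = mujoco_py.MjSim(model)
sim.step()

# Native on-screen viewer: textures, lighting and shadows come from MuJoCo itself.
viewer = mujoco_py.MjViewer(sim)
viewer.render()

# Offscreen RGB frame, e.g. as input to a vision-based policy.
rgb = sim.render(width=64, height=64)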

ISSUES

  1. When running vision-based tasks, there is a small visualizer window for every condition, plus two large windows. One of the large windows gives you the "robot" view, so you get a better sense of what the robot is doing, while the other can be panned/zoomed to look around. For non-vision tasks there is only a single (large) window. I have tried making the small windows invisible, but that results in blank images, and I do not yet know how to fix it.

  2. You will see the following two lines in agent_mjc.py:

for j in range(5):
    self._viewer[i].render()

Ideally, we should not need the above lines, but without them, the images that I get for policy learning are blank. I am looking for a way to fix this.

Changes to hyperparams.py

I tested the following examples: reacher_vision, mjc_example, mjc_badmm_example. mjc_peg_images seems to be throwing some TensorFlow errors, so it has not been tested. Convergence is a bit slower for the reacher (presumably because of the slightly higher-quality visualization), but it still learns to perform the task in a reasonable number of iterations (~8). I have made changes to all the hyperparams.py files that use MuJoCo, and these changes are:

  1. You now need to specify the camera position in a slightly different fashion (a lookat point, a distance, and azimuth and elevation angles).

  2. The pos_body_offset argument now needs to be a list of lists, so that we can apply different displacements to different objects for every condition. The older version of agent_mjc just applied the same displacement everywhere. Both changes are illustrated in the sketch below.

I have not tested mjc_mdgps_example since it uses Caffe - I can change it to use TensorFlow and test it, though (I have already tested MDGPS on the reacher).
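To make the two hyperparameter changes concrete, here is a hedged sketch of what the agent section of a hyperparams.py might contain. The dictionary layout for 'camera_pos' (lookat/distance/azimuth/elevation keys) and all numeric values are assumptions for illustration; the actual key names and shapes in the branch may differ.

# Hedged sketch of the updated agent config in hyperparams.py. The camera_pos
# dictionary layout and all numeric values are illustrative assumptions.
import numpy as np

AGENT = {
    # Camera is now given as a lookat point, a distance, and azimuth /
    # elevation angles instead of a single position vector.
    'camera_pos': {
        'lookat': np.array([0.0, 0.0, 0.0]),  # point the camera looks at
        'distance': 2.5,                      # distance from the lookat point
        'azimuth': 90.0,                      # degrees
        'elevation': -30.0,                   # degrees
    },
    # pos_body_offset is now a list of lists: one inner list per condition,
    # holding one offset per displaced body, so different objects can be
    # moved differently in each condition.
    'pos_body_idx': np.array([4]),
    'pos_body_offset': [
        [np.array([-0.1,  0.1, 0.0])],  # condition 0
        [np.array([ 0.1,  0.1, 0.0])],  # condition 1
        [np.array([-0.1, -0.1, 0.0])],  # condition 2
        [np.array([ 0.1, -0.1, 0.0])],  # condition 3
    ],
    # ... other agent keys (model filename, dt, sensor dims, etc.) omitted.
}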

Feedback

There's probably a lot that can be fixed/improved here, so I welcome all kinds of feedback. Thanks!

More

I will probably add a pushing task (with textures, etc.) to the repo sometime in the future.

avisingh599 commented 7 years ago

There is a bug in agent_mjc.py - the image dimensions are not reordered before the image is flattened. I think multi_modal_network and multi_modal_network_fp expect images in different formats ((h, w, c) and (c, w, h), respectively), and my current code produces (h, w, c). I will make mine compatible with the fp network.
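A minimal sketch of the fix described above, assuming multi_modal_network_fp expects channel-first (c, w, h) images as stated; the image shape is illustrative.

import numpy as np

# Illustrative frame in (h, w, c) order, as the current code produces it.
img_hwc = np.zeros((64, 80, 3), dtype=np.uint8)

# Reorder to (c, w, h) before flattening so the flattened observation matches
# what multi_modal_network_fp expects.
img_cwh = np.transpose(img_hwc, (2, 1, 0))
obs_img = img_cwh.flatten()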