cbfinn / gps

Guided Policy Search
http://rll.berkeley.edu/gps/

Updates to training with vision #58

Closed cbfinn closed 7 years ago

cbfinn commented 7 years ago

This code includes some updates for vision experiments in mujoco, including:
-- end-effector point type that has target information deleted [only tested for the reacher, may not be general...]
-- proper saving of tensorflow networks, so that they can be loaded later
-- feature point architecture in tensorflow
-- support for pre-training the fully-connected layers before the vision layers
-- support for including visual features in the state and training them end-to-end! (MDGPS only)
-- working example with a 2D reacher task with vision

wmontgomery4 commented 7 years ago

LGTM aside from the comments above. I'm going to pull the changes and run the reacher example, and then I'll give another okay.

I'm not really a huge fan of the END_EFFECTOR_POINTS_NO_TARGET to be honest, but I think it's fine for now. One idea Vikash had is to just give the desired idxs for each of the sensor_dims, so it would default to using the full vector, but you could give it specific idxs if you want. I think for now this is fine though, and we can always remove it later.
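
A rough sketch of what the per-sensor idxs idea could look like, just to make it concrete. The helper name `select_sensor_idxs` and the 9-dimensional layout (two 3D end-effector points followed by a 3D target) are assumptions for illustration, not anything in the gps codebase:

```python
import numpy as np


def select_sensor_idxs(sensor_data, idxs=None):
    """Return the full sensor vector, or only the requested indices.

    Hypothetical helper: with idxs=None the whole vector is used,
    which mirrors the "default to the full vector" idea.
    """
    sensor_data = np.asarray(sensor_data)
    if idxs is None:
        return sensor_data
    # Index along the last axis so batched data (T x dim) also works.
    return sensor_data[..., list(idxs)]


# Assumed layout: two 3D end-effector points, then a 3D target.
ee_points = np.arange(9.0)

full = select_sensor_idxs(ee_points)                    # default: everything
no_target = select_sensor_idxs(ee_points, idxs=range(6))  # drop the target
```

With something like this, a separate sensor type for the no-target case might not be needed, since the experiment config could list the wanted idxs instead.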

wmontgomery4 commented 7 years ago

Okay, just ran it and it seems to be working for me. The only issue is that the model saving seems quite slow. It might also be pickling other stuff during that time though (pickling the images seems to be pretty slow).

cbfinn commented 7 years ago

Agreed regarding END_EFFECTOR_POINTS_NO_TARGET. Let's revisit it in the future.

Also agreed that the model saving is quite slow. I think we should explore alternative formats for storing the images that are compressed and faster, without requiring multiple files. I think we discussed npz, hdf5, and pandas at the meeting 2 weeks ago.
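
As a minimal sketch of the npz option, assuming the images are held as a single uint8 array: `numpy.savez_compressed` writes one compressed archive, so no extra files are needed. The function names and the array shape below are made up for illustration, and whether this is actually faster than pickling would have to be measured on real samples:

```python
import io

import numpy as np


def save_images_npz(file, images):
    """Write the image array into a single compressed .npz archive."""
    np.savez_compressed(file, images=images)


def load_images_npz(file):
    """Read the image array back from the archive."""
    with np.load(file) as data:
        return data["images"]


# Assumed shape for illustration: 4 RGB images of 8x8 pixels.
images = np.zeros((4, 8, 8, 3), dtype=np.uint8)
images[0, 0, 0, 0] = 255

# An in-memory buffer stands in for a file path here.
buf = io.BytesIO()
save_images_npz(buf, images)
buf.seek(0)
restored = load_images_npz(buf)
```

Passing a filename instead of the buffer works the same way; hdf5 would need an extra dependency such as h5py.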

Thanks for reviewing!