cbfinn / gps

Guided Policy Search
http://rll.berkeley.edu/gps/
Other
594 stars 239 forks source link

RFGPS #62

Open wmontgomery4 opened 7 years ago

wmontgomery4 commented 7 years ago

Here's the commit for RFGPS, it's pretty large but here's the main takeaways:

The one remaining issue is how to initialize the EM. Right now it starts with Kmeans, but I have some code which starts it from random points instead (commented out, near the end of algorithm_mdgps.py in the _cluster_traj_em method).

anuragajay commented 7 years ago

@cbfinn The code looks correct to me. Should I also make the other changes you suggested?

wmontgomery4 commented 7 years ago

Don't worry, I'll handle those other changes Anurag. Thanks for the comments Chelsea!

wmontgomery4 commented 7 years ago

@cbfinn Now that the PIGPS stuff is merged, it's going to be kind of hard to merge this stuff in a way that isn't overly complicated. I think it would make sense to refactor the current codebase first, in a way that allows

wmontgomery4 commented 7 years ago

Whoops, accidentally hit 'close and comment'. I was going to send this in the email thread instead, but I'll just say it here: I think it would make sense to focus on refactoring the current codebase before merging this, so that the rfgps code can be written more succinctly. I've got a few ideas about how that should go:

I've got my quals exam in ~10 days, so I'm going to put this stuff at a lower priority right now, and make a branch off this for the RWR stuff. In the meantime this branch can serve as the public implementation of rfgps. Maybe we should have a refactoring meeting sometime soon to talk about this more?

EdsterG commented 7 years ago

I don't think we should change CostState to only take a single data_type. I found the dictionary of data_type's to be very useful. It may be best to make a new cost function CostStateWeighted?

wmontgomery4 commented 7 years ago

That's a fair point, and I agree that changing the API at this point is unnecessarily confusing. The main reason I wanted to do it was to add the A parameter, as this allows you get arbitrary linear functions of the state (i.e. the distance between the end effector and target). I think that cost_linwp does something similar on one of the other branches though, I'll try to just make it look like that.

cbfinn commented 7 years ago

Having a refactoring meeting sounds like a good idea. I'm pretty busy with ICLR right now (deadline in <7 days), so let's figure it out after your quals.

I agree with your two main points!