cbfinn / gps

Guided Policy Search
http://rll.berkeley.edu/gps/
Other
594 stars 239 forks source link

MDGPS Cleanup/Refactor #45

Closed wmontgomery4 closed 7 years ago

wmontgomery4 commented 7 years ago

Hey @cbfinn, the 'harley_refactor' PR is pretty far behind at this point, and I have some other new things that I wanted to add. This PR shouldn't really change anything functionally, and includes several things from that other pull request (which I'll close shortly):

1) Fixes issue with policy_sample_mode (no more keep_samples conflicting with it). 2) Restrict linesearch to the log space only, using bisection with a min/max_eta (we already have an explicit min_eta and implicit max_eta, and the bisection works pretty quickly without any of the stability issues from the old version). 3) Cleans up MDGPS quite a bit, making it faster and the code more legible. 4) Fixes the error in mjc_mdgps_example.

There are some larger architectural changes too, that clean up a lot of code: 1) The policy_prior/policy_prior_gmm classes do the fitting now, kind of like how dynamics_lr_prior does the fitting (this is a bit clunky still, but I like it more than having the full _update_policy_fit method) 2) The dynamics prior uses gauss_fit_joint_prior instead of doing the computation itself.

The only real functional change is that MDGPS no longer uses the classic/global step sizes. Now I'm just averaging predicted/actual_impr across the conditions and using a single step size for everything. This makes more sense for clustering experiments, and it's nice to not have to worry about another hyperparameter, which didn't seem to make much of a difference anyways (I still have a step_rule hyperparam, but it should normally be laplace, I'm just using the mc estimates for some debugging with reacher).

I haven't included any of the clustering stuff in here, nor single GMMs (not sure if everyone can use those anyways), and I'm going to base future clustering stuff on this branch.

cbfinn commented 7 years ago

Other than the one comment, this looks good! It would be nice to squash your commits into one commit before merging.

wmontgomery4 commented 7 years ago

Squashed the commits, ready to merge when you are.

cbfinn commented 7 years ago

Thanks Harley!