Closed wmontgomery4 closed 7 years ago
Thanks Harley! I'll take a look at this on Sunday. I agree about changing back to a single mujoco viewer. The line search also seems worth speeding up if it's fairly easy to fix. Making an AlgorithmGPS seems lower priority to me at the moment, but feel free to code it if you think that it would be beneficial and make future research code development faster/easier.
Since it seems like the clustering stuff isn't there yet, I removed all of that stuff, but I think this refactor/update is still worthwhile. We can add the cluster stuff to this later once it's all been worked out.
This does not switch to a single GMM overall, but continues to use a GMM per condition. I think this is actually faster/more efficient, and wanted to put off any changes to this structure for a later refactor as it will require quite a few changes.
1) It fixes the issue with the
policy_prior_gmm
wherepolicy_sample_mode
had no effect becausekeep_samples
was set true. Before we were usingmax_samples = 20
, which meant that we were effectively holding onto 4 iterations of samples (when N=5), and running in purereplace
mode actually seems to improve performance as well as speed. 2) It restricts the linesearch to only using the non-log space, which seems to prevent some errors. 3) It cleans up MDGPS quite a bit, making it faster and the code more legible. 4) Fixes the error inmjc_mdgps_example
.@cbfinn feel free to take a look when you get a chance. There are a few more things I want to address though, but they could be put off for a later update: 1) I could make a base AlgorithmGPS class which handles several of the methods that are copied between BADMM/MDGPS. However, MDGPS might change when we add the clustering stuff, so I'm not sure if it's worth the code savings right now. 2) There are some TODOs/questions about the agent_mjc stuff, which I want to look into further (and I also want to switch back to only one viewer if possible). 3) The linesearch seems a bit slower than necessary. I think this could be improved by running golden section search in log-space (golden section search just needs a unimodal function, not necessarily concave).