jonasrothfuss / ProMP

Implementation of Proximal Meta-Policy Search (ProMP) as well as related Meta-RL algorithms. Includes a useful experiment framework for Meta-RL.
https://sites.google.com/view/pro-mp
MIT License

confusion about experiments in full_code branch #2

Closed ghost closed 5 years ago

ghost commented 5 years ago

Hey, could you please check the experiments code in the full_code branch? For example, in the variance comparison experiments I can't see where your proposed algorithm is, only dice_maml, vpg_maml, and vpd_dice_maml, which is confusing since these names don't seem to appear in the paper. Besides, are you assuming that each dimension is independent when computing the variance of the gradient?

jonasrothfuss commented 5 years ago

If you want to reproduce the ProMP results in the full_code branch, you have to run the following experiment script: https://github.com/jonasrothfuss/ProMP/blob/full_code/experiments/all_envs_eval/ppo_run_all.py

My apologies, the naming conventions differ from the paper, and the full_code branch contains additional material. This is why we created the lightweight main branch, which uses the same naming conventions as the paper. Hence, we highly recommend using the main branch.

ghost commented 5 years ago

Hey, @jonasrothfuss, thanks. But it looks like the code you pointed to doesn't contain the computation of the gradient variance. Could you confirm this?

jonasrothfuss commented 5 years ago

That's correct. You can find the code for computing the gradient variance in the following experiment folder: https://github.com/jonasrothfuss/ProMP/tree/full_code/experiments/gradient_variance
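For readers following the earlier question about per-dimension independence: a minimal sketch of what estimating gradient variance per dimension looks like, and how it relates to the full empirical covariance. This is a hypothetical illustration with made-up sample data, not the repository's actual code; the variable names (`grads`, `per_dim_var`) are assumptions for the example.

```python
import numpy as np

# Illustrative only: estimate the variance of a stochastic gradient
# estimator from K independent gradient samples of a D-dimensional
# parameter vector. Treating each dimension independently amounts to
# using only the diagonal of the empirical covariance matrix.

rng = np.random.default_rng(0)
K, D = 50, 10                                        # samples, parameter dims
grads = rng.normal(loc=1.0, scale=0.5, size=(K, D))  # stand-in gradient samples

# Per-dimension (diagonal) variance across the K samples
per_dim_var = grads.var(axis=0, ddof=1)              # shape (D,)

# A single scalar summary: mean variance over dimensions
mean_var = per_dim_var.mean()

# The full empirical covariance, by contrast, also captures
# cross-dimension correlations:
full_cov = np.cov(grads, rowvar=False)               # shape (D, D)
assert np.allclose(np.diag(full_cov), per_dim_var)
```

The diagonal of `np.cov` coincides with the per-dimension variances, so reporting only per-dimension variance is equivalent to discarding the off-diagonal covariance terms.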