Open chenkaiyu1997 opened 5 years ago
Ensemble value functions (RPF): "Randomized Prior Functions for Deep Reinforcement Learning", ref. by POLO and OpenAI RND; meta-learning: https://blog.openai.com/reptile/ ; https://github.com/openai/supervised-reptile https://github.com/tristandeleu/pytorch-maml-rl (quite a lot of code available)
https://github.com/eambutu/snail-pytorch https://github.com/sagelywizard/snail
https://github.com/thanard/me-trpo https://sites.google.com/view/mb-mpo/code
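Side note on the RPF item above: a minimal NumPy sketch of the core idea, where each ensemble member is a trainable function plus a fixed, untrained random prior scaled by `beta`. This is a toy regression version under my own assumptions, not the paper's code: the names `make_prior`, `fit_rpf_ensemble` and `predict_rpf` are made up for illustration, the trainable part is a plain ridge regression instead of a deep network, and the bootstrapping/noise the paper also uses is left out.

```python
import numpy as np

def make_prior(rng, n_features, n_hidden=16):
    """Fixed random prior: a small random ReLU network that is never trained."""
    W = rng.normal(size=(n_features, n_hidden))
    v = rng.normal(size=n_hidden)
    return lambda X: np.maximum(X @ W, 0.0) @ v

def fit_rpf_ensemble(X, y, n_members=5, beta=1.0, ridge=1e-3, seed=0):
    """Fit n_members models of the form f_k(x) + beta * p_k(x).

    Only f_k (here: linear weights w) is trained; p_k stays fixed.
    Hypothetical helper, simplified for illustration.
    """
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        prior = make_prior(rng, X.shape[1])
        # Train the trainable part on the residual left by the prior.
        target = y - beta * prior(X)
        A = X.T @ X + ridge * np.eye(X.shape[1])
        w = np.linalg.solve(A, X.T @ target)
        members.append((w, prior))
    return members

def predict_rpf(members, X, beta=1.0):
    """Return one row of predictions per ensemble member."""
    return np.stack([X @ w + beta * prior(X) for w, prior in members])

rng = np.random.default_rng(1)
X_train = rng.normal(size=(20, 3))
y_train = X_train @ np.array([1.0, -2.0, 0.5])
members = fit_rpf_ensemble(X_train, y_train)

# Disagreement between members acts as an uncertainty signal.
X_far = 10.0 * np.ones((1, 3))
spread = predict_rpf(members, X_far).std(axis=0)
```

The spread across members at `X_far` is what would drive exploration; in an RL setting the regression target would be a TD target rather than `y_train`.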
infobot(IB) code will modeb base: EMI(IB)
SAC + HER. POLO -- RPF; HER: goals
InfoBot. Goal refs: HER; "Unsupervised Meta-Learning for Reinforcement Learning"; intent, unsupervised: "Learning a Prior over Intent via Meta-Inverse Reinforcement Learning" https://arxiv.org/abs/1805.12573v3
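Since HER keeps coming up as the goal reference, here is a rough sketch of its relabeling step with the "future" strategy: each transition is stored once with its original goal and `k` more times with goals that were actually achieved later in the same episode. The transition layout (dicts with `achieved_goal` and `goal` keys), the function names and the sparse 0/-1 reward are assumptions for illustration, not taken from any of the repos above.

```python
import random

def sparse_reward(achieved, goal):
    """0 when the goal is reached, -1 otherwise (assumed reward shape)."""
    return 0.0 if achieved == goal else -1.0

def her_relabel(episode, reward_fn=sparse_reward, k=4, rng=None):
    """Return original transitions plus k relabeled copies of each.

    episode: list of dicts with keys 'obs', 'action', 'achieved_goal', 'goal',
    where 'achieved_goal' is the goal reached after taking the action.
    """
    rng = rng or random.Random(0)
    out = []
    for t, tr in enumerate(episode):
        # Original transition with its original (possibly never reached) goal.
        out.append({**tr, "reward": reward_fn(tr["achieved_goal"], tr["goal"])})
        for _ in range(k):
            # "future" strategy: pick a goal achieved at step t or later.
            future = rng.randint(t, len(episode) - 1)  # inclusive bounds
            new_goal = episode[future]["achieved_goal"]
            out.append({**tr, "goal": new_goal,
                        "reward": reward_fn(tr["achieved_goal"], new_goal)})
    return out

# Toy episode: the agent reaches positions 1, 2, 3 but was asked for 10.
episode = [
    {"obs": i, "action": 1, "achieved_goal": i + 1, "goal": 10}
    for i in range(3)
]
replay = her_relabel(episode, k=4)
```

The relabeled copies would go into the replay buffer of an off-policy learner such as SAC, which is how the "SAC + HER" combination noted above would fit together.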
Four key points: