createamind / emach

EMACH: Efficient Model-based Agent with Curiosity and Hierarchy for Reinforcement Learning.

Plan #2

Open chenkaiyu1997 opened 5 years ago

chenkaiyu1997 commented 5 years ago

Four key points:

  1. Learning Dynamics.
    • First try: Ensemble Dynamics trained with MSE (as in MB-MPO).
    • Then try an RNN.
    • Model uncertainty: MDN, or a GAN on z, on the 1234 dataset.
    • Afterwards: use stacked images.
    • Multi-level long-term dynamics.
  2. Planning
    • MPC
    • MCTS
  3. Curiosity & Exploration
    • Ensemble Value Functions (RPF)
    • RND
  4. Hierarchy (optional)
    • Tree-like Option Discovery & use and disuse
    • DIAYN
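The ensemble-dynamics idea in point 1 can be sketched minimally: several models, each fit with MSE on a bootstrapped resample of the transitions, with ensemble disagreement serving as the uncertainty signal. This is a hypothetical linear-model sketch (class and method names are mine, not from this repo), not the MB-MPO implementation itself:

```python
import numpy as np

class EnsembleDynamics:
    """Ensemble of linear dynamics models s' = [s, a] @ W, each fit by
    least squares (MSE) on a bootstrap resample of the data. The spread
    of the members' predictions is a crude model-uncertainty estimate."""

    def __init__(self, n_models=5, seed=0):
        self.n_models = n_models
        self.rng = np.random.default_rng(seed)
        self.weights = []  # one (dim_s + dim_a, dim_s) matrix per member

    def fit(self, states, actions, next_states):
        X = np.hstack([states, actions])
        self.weights = []
        for _ in range(self.n_models):
            idx = self.rng.integers(0, len(X), size=len(X))  # bootstrap resample
            W, *_ = np.linalg.lstsq(X[idx], next_states[idx], rcond=None)
            self.weights.append(W)

    def predict(self, state, action):
        x = np.hstack([state, action])
        preds = np.stack([x @ W for W in self.weights])
        # mean prediction, plus ensemble disagreement as uncertainty
        return preds.mean(axis=0), preds.std(axis=0)
```

Swapping the linear members for neural networks (and later an RNN, as the plan suggests) changes only the per-member fit/predict, not the ensemble logic.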
chenkaiyu1997 commented 5 years ago
  1. Add MPO, MPC, RPF & test. Deadline: 1.6; start writing a paper on 1.5.
  2. Try RNN-MDN or GAN/NCE on the 1234 dataset, plus multi-level long-term dynamics.
  3. If 2 works, add RNN-MDN/GAN/NCE & test.
  4. Try tree-like option discovery & use and disuse, or DIAYN.
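For the MPC item above, the simplest planner over a learned model is random shooting: sample candidate action sequences, roll each through the dynamics, and execute the first action of the best sequence. A minimal sketch with placeholder function names (not project code):

```python
import numpy as np

def mpc_random_shooting(dynamics_fn, reward_fn, state, horizon=10,
                        n_candidates=256, action_dim=1, rng=None):
    """Random-shooting MPC: score each sampled action sequence by its
    predicted return under the model, return the best first action."""
    rng = rng or np.random.default_rng(0)
    actions = rng.uniform(-1, 1, size=(n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    for i in range(n_candidates):
        s = state
        for t in range(horizon):
            s = dynamics_fn(s, actions[i, t])
            returns[i] += reward_fn(s, actions[i, t])
    return actions[np.argmax(returns), 0]
```

Replacing the uniform sampler with an iteratively refit Gaussian gives CEM-style MPC; the MCTS option in the plan would replace this shooting loop with tree search.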
zdx3578 commented 5 years ago

Ensemble Value Functions (RPF): "Randomized Prior Functions for Deep Reinforcement Learning", referenced by POLO and OpenAI's RND. Meta-learning: https://blog.openai.com/reptile/ ; https://github.com/openai/supervised-reptile ; https://github.com/tristandeleu/pytorch-maml-rl (plenty of code available).
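The RPF idea is that each ensemble member's value estimate is a trainable network plus a frozen random "prior" network, so members stay diverse where data is scarce and their disagreement drives exploration. A minimal linear sketch under that scheme (names are mine, not the reference implementation):

```python
import numpy as np

class RPFMember:
    """One ensemble member with a randomized prior:
    q(x) = trainable(x) + beta * prior(x), where prior stays frozen."""

    def __init__(self, dim, beta=3.0, seed=0):
        rng = np.random.default_rng(seed)
        self.prior_w = rng.normal(size=dim)  # frozen random prior weights
        self.w = np.zeros(dim)               # trainable weights
        self.beta = beta

    def q(self, x):
        return x @ self.w + self.beta * (x @ self.prior_w)

    def fit(self, X, targets):
        # regress trainable part onto (target - prior contribution), MSE
        residual = targets - self.beta * (X @ self.prior_w)
        self.w, *_ = np.linalg.lstsq(X, residual, rcond=None)

def ensemble_bonus(members, x):
    """Disagreement across members = exploration bonus."""
    return np.array([m.q(x) for m in members]).std()
```

Before training, the random priors make members disagree; after fitting on shared data, their q-values agree on seen inputs and the bonus collapses.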

https://github.com/eambutu/snail-pytorch https://github.com/sagelywizard/snail

https://github.com/thanard/me-trpo https://sites.google.com/view/mb-mpo/code

InfoBot (IB) code will be model-based: EMI (IB).

SAC + HER. POLO / RPF; HER: goal-conditioned.
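HER's core trick, usable with SAC as noted above, is relabeling: each stored transition is duplicated with its goal replaced by a state actually achieved later in the episode, so even failed episodes yield reward signal. A minimal sketch of the "future" strategy (tuple layout and names are illustrative, not from this repo):

```python
import random

def her_relabel(episode, reward_fn, k=4, rng=None):
    """Hindsight Experience Replay relabeling, 'future' strategy.
    episode: list of (state, action, next_state, goal) tuples.
    Returns (state, action, next_state, goal, reward) tuples: the
    original transition plus k copies with hindsight goals drawn from
    states reached at or after step t."""
    rng = rng or random.Random(0)
    out = []
    for t, (s, a, s2, g) in enumerate(episode):
        out.append((s, a, s2, g, reward_fn(s2, g)))
        for _ in range(k):
            future = rng.randrange(t, len(episode))
            g2 = episode[future][2]  # an achieved state becomes the new goal
            out.append((s, a, s2, g2, reward_fn(s2, g2)))
    return out
```

With a sparse reward (0 on goal, -1 otherwise), the relabeled copies are what give the off-policy learner nonzero gradient early on.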

InfoBot goal references: HER; "Unsupervised Meta-Learning for Reinforcement Learning"; intent (unsupervised): "Learning a Prior over Intent via Meta-Inverse Reinforcement Learning", https://arxiv.org/abs/1805.12573v3