Open zdx3578 opened 5 years ago
Unsupervised pre-training: Variational Option Discovery Algorithms :: real hierarchical; Diversity Is All You Need: SAC-based code is available
Model-Ensemble Trust-Region Policy Optimization, Kurutach et al, 2018. Algorithm: ME-TRPO. Code is available.
Model-Based Reinforcement Learning via Meta-Policy Optimization, Clavera et al, 2018. Algorithm: MB-MPO.
EMI: Exploration with Mutual Information Maximizing State and Action Embeddings
POLO exploration: Randomized Prior Functions for Deep Reinforcement Learning (related to RND)