Problem:
Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL. However, research in model-based RL has not been very standardized.
Innovation/Contribution:
To facilitate research in MBRL, this paper gathers a wide collection of MBRL algorithms and proposes over 18 benchmarking environments specially designed for MBRL. The algorithms are benchmarked under unified problem settings, including noisy environments.
Comment:
Note this paper was published in 2019/07.
Sergey's paper on PETS was published in 2018/05 (link). It proposed a new algorithm, PETS, that achieves performance on par with MFRL algorithms like PPO/SAC. (Figure 3)
But in this paper, MBRL is shown to still be sub-optimal. (Figure 1 and Table 1)
Then, in 2019/07, another MBRL paper, from Panasonic (link), proposed a new Bayesian MBRL algorithm (VI-MPC + PaETS) which outperforms the original PETS and also outperforms SAC/PPO on certain tasks. (Figure 4)
So the conclusions conflict. (See the Cheetah score.)
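For context on what PETS actually does: it trains a probabilistic ensemble of dynamics models and plans via trajectory sampling plus CEM-based MPC. The following is a minimal sketch of that control loop, under loud assumptions: a hand-coded toy 1-D dynamics function (`s' = s + a` plus noise) stands in for the learned neural ensemble, and the cost is just distance to the origin. None of these names come from the PETS codebase.

```python
import numpy as np

# Sketch of the PETS loop: ensemble dynamics + trajectory sampling + CEM.
# The "ensemble" here is a toy stand-in, NOT a trained model.

rng = np.random.default_rng(0)

def make_ensemble(n_models=5, noise=0.05):
    """Stand-in for a trained probabilistic ensemble: each member is the
    true dynamics s' = s + a plus member-specific Gaussian noise."""
    def member(eps):
        return lambda s, a: s + a + eps * rng.standard_normal(s.shape)
    return [member(noise) for _ in range(n_models)]

def evaluate_plan(ensemble, s0, actions, n_particles=4):
    """Trajectory sampling: propagate several particles, each step through
    a randomly chosen ensemble member; cost = squared distance to origin."""
    cost = 0.0
    for _ in range(n_particles):
        s = s0.copy()
        for a in actions:
            model = ensemble[rng.integers(len(ensemble))]
            s = model(s, a)
            cost += np.sum(s ** 2)
    return cost / n_particles

def cem_plan(ensemble, s0, horizon=5, iters=10, pop=64, elites=8):
    """Cross-entropy method over open-loop action sequences."""
    mu, sigma = np.zeros((horizon, 1)), np.ones((horizon, 1))
    for _ in range(iters):
        cands = mu + sigma * rng.standard_normal((pop, horizon, 1))
        costs = np.array([evaluate_plan(ensemble, s0, c) for c in cands])
        elite = cands[np.argsort(costs)[:elites]]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mu  # in MPC style, only mu[0] would be executed, then replan

ensemble = make_ensemble()
s0 = np.array([2.0])
plan = cem_plan(ensemble, s0)
print(plan[0])  # first action should push the state toward 0
```

With `s0 = 2` and dynamics `s' = s + a`, the planner should discover a strongly negative first action. The Bayesian follow-ups discussed above (VI-MPC + PaETS) mainly change how the posterior over dynamics and plans is handled, not this overall loop.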
Link: arxiv