MAKING SENSE OF REINFORCEMENT LEARNING AND PROBABILISTIC INFERENCE By: Anonymous

Not yet published as of 2020/01/07.

Problem:

A recent line of research casts ‘RL as inference’ and suggests a particular framework to generalize the RL problem as probabilistic inference.

Innovation:

Our paper surfaces a key shortcoming in that approach, and clarifies the sense in which RL can be coherently cast as an inference problem. In particular, an RL agent must consider the effects of its actions upon future rewards and observations: the exploration-exploitation trade-off. We demonstrate that the popular ‘RL as inference’ approximation can perform poorly in even very basic problems. However, we show that with a small modification the framework does yield algorithms that can provably perform well, and we show that the resulting algorithm is equivalent to the recently proposed K-learning, which we further connect with Thompson sampling.
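To make the exploration-exploitation trade-off and the Thompson sampling connection concrete, here is a minimal sketch of Thompson sampling on a Bernoulli bandit. It is not taken from the paper: the function name `thompson_bernoulli`, the Beta(1, 1) priors, the arm means, and the step count are all assumptions chosen for illustration.

```python
import random


def thompson_bernoulli(true_means, steps=2000, seed=0):
    """Thompson sampling on a Bernoulli bandit (illustrative sketch only).

    Assumes an independent Beta(1, 1) prior on each arm's success probability.
    Returns how many times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    wins = [1] * n_arms    # Beta alpha: prior pseudo-count + observed successes
    losses = [1] * n_arms  # Beta beta: prior pseudo-count + observed failures
    pulls = [0] * n_arms
    for _ in range(steps):
        # Draw one plausible mean per arm from its posterior, then act
        # greedily with respect to that draw. Uncertain arms sometimes
        # produce high draws, which is what drives exploration.
        samples = [rng.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical arm means; the agent does not see these directly.
    print(thompson_bernoulli([0.2, 0.5, 0.8]))
```

The agent never uses a fixed exploration rate. Exploration fades on its own as the posteriors concentrate, so most pulls end up on the best arm.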

Conclusion:

We show that a simple variant to the RL as inference framework (K-learning) can incorporate uncertainty estimates to drive efficient exploration. We support our claims with a series of simple didactic experiments. We leave the crucial questions of how to scale these insights up to large complex domains for future work.
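As a rough illustration of how uncertainty estimates can drive exploration in a soft-max policy, the sketch below compares a Boltzmann policy over posterior mean rewards with one over optimistic values built from the cumulant generating function of each arm's posterior. This is only in the spirit of K-learning for a bandit, not the paper's algorithm: the Gaussian posteriors, the fixed inverse temperature `beta`, and the helper names `optimistic_values` and `boltzmann_policy` are assumptions made here.

```python
import math


def optimistic_values(means, variances, beta):
    """Optimistic value per arm, assuming a Gaussian posterior N(mean, variance).

    For Gaussian mu, (1 / beta) * log E[exp(beta * mu)] = mean + beta * variance / 2,
    so arms with more posterior uncertainty receive a larger bonus.
    """
    return [m + 0.5 * beta * v for m, v in zip(means, variances)]


def boltzmann_policy(values, beta):
    """Soft-max policy with probabilities proportional to exp(beta * value)."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical posterior: arm 0 is well known, arm 1 is highly uncertain.
    means = [0.5, 0.4]
    variances = [0.0, 1.0]
    beta = 1.0

    point_policy = boltzmann_policy(means, beta)
    optimistic_policy = boltzmann_policy(
        optimistic_values(means, variances, beta), beta
    )
    print("soft-max on posterior means:   ", point_policy)
    print("soft-max on optimistic values: ", optimistic_policy)
```

In this toy setting the policy built on posterior means prefers the well-known arm, while the policy built on optimistic values shifts probability toward the uncertain arm.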

Comment: This is the latest paper on "connecting RL with probabilistic inference" (or "RL with Bayesian inference"?). Here is a Reddit comment on it:

This paper argues against the proposal in "Reinforcement learning and control as probabilistic inference: Tutorial and review" #8 to generalize RL as a probabilistic inference problem, in other words, to treat it with the tools of statistics and probability. It argues that this generalization is the wrong approach because it leads to algorithms that fail to solve even simple problems. Instead, the paper proposes that RL can already be viewed as a probabilistic problem, so no generalization is needed. On that basis, it shows that the K-learning algorithm can be considered a probabilistic inference procedure, and its experiments show that K-learning works better than the other algorithms it is compared against.

This is part of a wider research trend that tries to connect reinforcement learning and deep learning with statistics and probability theory.
