End-to-End Differentiable Physics for Learning and Control

Problem

Physics simulation environments such as MuJoCo are poorly suited to deep learning. Because they are not natively differentiable, gradients (e.g., policy gradients for control tasks) must be estimated by finite differencing, which is slow and can be numerically unstable. A recent work built a differentiable physics simulator on top of an automatic differentiation framework, but it supported only balls as objects and offered limited extensibility.
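To make the cost of finite differencing concrete, here is a minimal sketch in plain Python. It is not taken from the paper or from MuJoCo: the 1D point mass, the names `rollout`, `finite_difference_grad`, and `rollout_with_grad`, and all constants are assumptions chosen for illustration. It compares a central-difference estimate of the final position's sensitivity to the applied force against the same quantity carried analytically through the rollout.

```python
def rollout(force, mass=1.0, dt=0.01, steps=100):
    """Semi-implicit Euler rollout of a 1D point mass; returns the final position."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        v += dt * force / mass
        x += dt * v
    return x


def finite_difference_grad(f, theta, eps=1e-5):
    """Central differences: two full rollouts for every scalar parameter."""
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)


def rollout_with_grad(force, mass=1.0, dt=0.01, steps=100):
    """Same rollout, carrying d(state)/d(force) alongside the state."""
    x, v = 0.0, 0.0
    dx, dv = 0.0, 0.0  # sensitivities of x and v with respect to force
    for _ in range(steps):
        v += dt * force / mass
        dv += dt / mass
        x += dt * v
        dx += dt * dv
    return x, dx


fd_grad = finite_difference_grad(rollout, 2.0)
x_final, exact_grad = rollout_with_grad(2.0)
print(fd_grad, exact_grad)
```

In this toy case the two values agree closely because the rollout is linear in the force. With n parameters, the finite-difference route needs 2n rollouts and a hand-picked step size `eps`, while the analytic route needs one pass.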

Solution

This paper presents a differentiable two-dimensional physics simulator that addresses the main limitations of past work. Specifically, the system simulates rigid body dynamics via a linear complementarity problem (LCP), which computes the equations of motion subject to contact and friction constraints. It can use general simulation methods to determine the non-differentiable parts of the dynamics (namely, the presence or absence of collisions between convex shapes), while still providing a simulation environment that is end-to-end differentiable (given the observed set of collisions).
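As a rough illustration of what an LCP looks like, the standard form is: find z >= 0 such that w = Mz + q >= 0 and z . w = 0. The sketch below solves it with projected Gauss-Seidel, a simple iterative method swapped in here for brevity and not claimed to be the solver used in the paper. The single frictionless 1D contact, the helper names `solve_lcp_pgs` and `step_with_contact`, and all constants are assumptions made for this example.

```python
def solve_lcp_pgs(M, q, iters=200):
    """Projected Gauss-Seidel for: find z >= 0 with w = M z + q >= 0 and z . w = 0."""
    n = len(q)
    z = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Row residual without the diagonal term, then project onto z_i >= 0.
            r = q[i] + sum(M[i][j] * z[j] for j in range(n) if j != i)
            z[i] = max(0.0, -r / M[i][i])
    return z


def step_with_contact(v, mass=2.0, g=9.81, dt=0.01):
    """One velocity update for a body touching the ground (1D, frictionless).

    Unknown: contact impulse lam >= 0.
    Constraint: v_next = v_free + lam / mass >= 0, with lam * v_next = 0.
    """
    v_free = v - dt * g  # velocity if there were no contact
    lam = solve_lcp_pgs([[1.0 / mass]], [v_free])[0]
    return v_free + lam / mass, lam


v_rest, lam_rest = step_with_contact(-1.0)  # moving into the ground
v_away, lam_away = step_with_contact(1.0)   # moving away from the ground
print(v_rest, lam_rest, v_away, lam_away)
```

When the contact is active, the impulse in this toy model is lam = -mass * v_free, which is smooth in both the velocity and the mass. That is the sense in which the dynamics are differentiable once the set of collisions is fixed.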

Effect

The entire physical simulation environment can be embedded as a "layer" in a deep network, so agents can both learn the environment's parameters to match observed behavior and improve control performance via traditional gradient-based learning.
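The parameter-learning use case can be sketched with a toy example in plain Python. This is an assumption-laden illustration, not the paper's code: a 1D point mass stands in for the simulator, a hand-derived sensitivity stands in for an automatic differentiation framework, and the names `simulate_with_mass_grad` and `fit_mass` are hypothetical. Gradient descent recovers an unknown mass from one observed final position.

```python
def simulate_with_mass_grad(mass, force=2.0, dt=0.01, steps=100):
    """Roll out a 1D point mass; return final position and d(position)/d(mass)."""
    x, v = 0.0, 0.0
    dx_dm, dv_dm = 0.0, 0.0
    for _ in range(steps):
        v += dt * force / mass
        dv_dm += -dt * force / mass**2  # derivative of the velocity update
        x += dt * v
        dx_dm += dt * dv_dm
    return x, dx_dm


def fit_mass(x_observed, mass_init=1.0, lr=1.0, iters=300):
    """Gradient descent on the squared error between simulated and observed position."""
    mass = mass_init
    for _ in range(iters):
        x, dx_dm = simulate_with_mass_grad(mass)
        grad = 2.0 * (x - x_observed) * dx_dm  # chain rule through the rollout
        mass -= lr * grad
    return mass


# "Observed" data generated with a ground-truth mass of 2.0 (assumed for the demo).
x_observed, _ = simulate_with_mass_grad(2.0)
learned_mass = fit_mass(x_observed)
print(learned_mass)
```

The same pattern carries over to control: replace the mass with policy parameters and the squared error with a task cost, and the gradient flows through the rollout in the same way.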

https://papers.nips.cc/paper/2018/hash/842424a1d0595b76ec4fa03c46e8d755-Abstract.html
