chufanchen / read-paper-and-code

arXiv 2021 | Training Larger Networks for Deep Reinforcement Learning #151

Closed chufanchen closed 5 months ago

chufanchen commented 5 months ago

https://arxiv.org/abs/2102.07920

chufanchen commented 5 months ago

It has been reported in several studies that deep RL agents become unstable when trained with larger networks. This runs contrary to intuition given recent progress in computer vision, where larger and more complex architectures such as ViT have consistently achieved better performance.

Sutton identifies a deadly triad of function approximation, bootstrapping, and off-policy learning. When these three properties are combined, learning can become unstable and potentially diverge, with the value estimates growing unbounded. Several prior works have attempted to mitigate this problem.
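To see why the triad is dangerous, consider the classic $w \rightarrow 2w$ example from Sutton and Barto's textbook (an illustration added here, not taken from the paper): under an off-policy state distribution, semi-gradient TD(0) with linear function approximation drives the single shared weight to infinity. A minimal sketch in Python:

```python
# Illustrative sketch (not from the paper): the w -> 2w divergence example
# from Sutton & Barto, Sec. 11.2. Two states share one weight w via linear
# features phi(s1)=1 and phi(s2)=2, so v(s1)=w and v(s2)=2w. Updating only
# the s1 -> s2 transition (an off-policy state distribution) while
# bootstrapping on v(s2) makes semi-gradient TD(0) diverge for gamma > 0.5.
gamma, alpha, w = 0.99, 0.1, 1.0
for step in range(100):
    td_error = 0.0 + gamma * (2 * w) - w   # reward 0, bootstrap on v(s2) = 2w
    w += alpha * td_error * 1.0            # semi-gradient step, phi(s1) = 1
print(w)  # w grows as (1 + alpha*(2*gamma - 1))**t, i.e. without bound
```

All three ingredients are needed: on-policy sampling, exact (tabular) values, or Monte Carlo targets would each restore convergence.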

This paper tries to mitigate the part of the problem related to function approximation. Previous work (e.g., simply making MLP/CNN networks larger) concluded that larger networks tend to perform better, but also become more unstable and more prone to divergence. For on-policy methods, networks that are too small or too large can cause a significant drop in policy performance.

To build a large network, the paper combines the three techniques listed in the next comment.

chufanchen commented 5 months ago

Methods

  1. Decoupling Representation Learning from RL: OFENet + auxiliary task $\rightarrow$ decouple unsupervised pretraining from the downstream task
  2. Distributed Training: Ape-X $\rightarrow$ mitigate the overfitting and rank-collapse issues of Q-networks
  3. Network Architectures: DenseNet $\rightarrow$ improved flow of information and gradients throughout the network (see the sketch below)
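
The key architectural idea behind items 1 and 3 is a DenseNet-style MLP: each layer's output is concatenated with its input, so later layers see all earlier features and gradients reach every layer directly. A minimal PyTorch sketch of such a block (an assumed implementation in the spirit of OFENet, not the authors' code; all dimensions are hypothetical):

```python
import torch
import torch.nn as nn

class DenseMLPBlock(nn.Module):
    """DenseNet-style MLP: concatenate each layer's output with its input."""

    def __init__(self, in_dim: int, growth: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = in_dim
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(nn.Linear(dim, growth), nn.ReLU()))
            dim += growth  # the next layer sees all previously computed features
        self.out_dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=-1)  # dense connectivity
        return x

# Usage: encode raw observations into a higher-dimensional representation
# that the downstream RL agent consumes (dimensions are made up).
encoder = DenseMLPBlock(in_dim=17, growth=32, num_layers=4)
features = encoder(torch.randn(8, 17))  # -> shape (8, 17 + 4*32) = (8, 145)
print(features.shape)
```

Because every layer's features are preserved in the output, the block can be made much deeper without starving later layers of signal, which is the property the paper leans on to scale network size.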