The loss function of deep networks is known to be non-convex, but the precise nature of this non-convexity is still an active area of research. In this work, we study the loss landscape of deep networks through the eigendecomposition of their Hessian matrix. In particular, we examine how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
Overview
Reading: Negative eigenvalues of the Hessian in deep neural networks
Abstract
DNN explainability involves understanding the loss function.
What tools can be used to learn more about the loss function?
The tool this paper proposes is eigendecomposition of the Hessian matrix, used to infer connections between the eigenvalues and eigenvectors and the dynamics of training.
The abstract also implies that currently used training algorithms are suboptimal for the actual loss functions of DNNs, since they do not handle negative eigenvalues properly; the paper therefore proposes a strategy for dealing with them.
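The two ideas above (eigendecomposition of the Hessian, and exploiting negative eigenvalues) can be illustrated on a toy example. This is a minimal sketch, not the paper's method: it uses a hand-rolled finite-difference Hessian on a two-parameter saddle loss L(w1, w2) = w1² − w2², where the gradient vanishes at the origin but the Hessian has one negative eigenvalue, so stepping along the corresponding eigenvector strictly decreases the loss.

```python
import math

def loss(w):
    # Toy saddle loss: Hessian is diag(2, -2), so one
    # positive and one negative eigenvalue at every point.
    w1, w2 = w
    return w1 ** 2 - w2 ** 2

def hessian_fd(f, w, eps=1e-4):
    """Hessian of f at w via central finite differences."""
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            wpp = list(w); wpp[i] += eps; wpp[j] += eps
            wpm = list(w); wpm[i] += eps; wpm[j] -= eps
            wmp = list(w); wmp[i] -= eps; wmp[j] += eps
            wmm = list(w); wmm[i] -= eps; wmm[j] -= eps
            H[i][j] = (f(wpp) - f(wpm) - f(wmp) + f(wmm)) / (4 * eps ** 2)
    return H

def eig_sym_2x2(H):
    """Eigenpairs of a symmetric 2x2 matrix, in closed form."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc
    def vec(lam):
        # Solve (H - lam I) v = 0 for a unit eigenvector.
        if abs(b) > 1e-12:
            v = (lam - d, b)
        else:
            v = (1.0, 0.0) if abs(a - lam) < abs(d - lam) else (0.0, 1.0)
        norm = math.hypot(v[0], v[1])
        return (v[0] / norm, v[1] / norm)
    return [(lam1, vec(lam1)), (lam2, vec(lam2))]

# At the origin the gradient is zero (a saddle), yet the Hessian
# spectrum reveals a descent direction of negative curvature.
w0 = (0.0, 0.0)
H = hessian_fd(loss, w0)                      # approx [[2, 0], [0, -2]]
neg_lam, neg_vec = min(eig_sym_2x2(H), key=lambda p: p[0])
step = 0.5
w_new = (w0[0] + step * neg_vec[0], w0[1] + step * neg_vec[1])
```

Here `loss(w_new) < loss(w0)` even though gradient descent would not move from `w0` at all, which is the intuition behind treating negative eigenvalues as useful signal rather than ignoring them.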