tomgoldstein / loss-landscape

Code for visualizing the loss landscape of neural nets
MIT License
2.72k stars 388 forks

Two problems about paper #24

Open seanM29 opened 5 years ago

seanM29 commented 5 years ago
Jamesswiz commented 5 years ago

I have the same query after reading the paper.

Can the authors please comment?

liiliiliil commented 3 years ago

I also don't understand the first question. : (

For the second one, I think the key is to show that a convex-looking region in the projected surface is also (approximately) convex in the original surface. A small absolute ratio |λ_min / λ_max| means the maximum eigenvalue is large compared to the minimum eigenvalue, which may be negative; in other words, the positive curvature is dominant. So a convex-looking region in the projected surface that also has a small absolute eigenvalue ratio corresponds to a nearly convex region in the original surface.
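As a toy illustration of that eigenvalue-ratio argument (not code from the paper — the Hessian here is a made-up diagonal matrix chosen just to show the |λ_min / λ_max| computation):

```python
import numpy as np

# Hypothetical Hessian: mostly positive curvature plus one slightly
# negative eigenvalue, i.e. a "nearly convex" point on the loss surface.
H = np.diag([5.0, 2.0, 0.5, -0.01])

eigvals = np.linalg.eigvalsh(H)          # ascending order for symmetric H
lam_min, lam_max = eigvals[0], eigvals[-1]
ratio = abs(lam_min / lam_max)

print(f"lambda_min={lam_min}, lambda_max={lam_max}, |ratio|={ratio:.4f}")
# The ratio is close to 0, so the positive curvature dominates and the
# region is close to convex even though lambda_min is slightly negative.
```

A ratio near 1 would instead mean the negative direction is as strong as the positive one, so the apparent convexity of the 2-D projection could be an artifact.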

knowlen commented 3 years ago

Not affiliated with the paper, but in non-convex optimization it is generally believed that wide minima generalize better than sharp minima. This clip from Leo Dirac (start at 16:30) conveys the intuition. The paper's results capture this empirically.
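A minimal sketch of that intuition (purely illustrative 1-D losses, not from the paper): if the test loss is the training loss shifted slightly in weight space, the same shift costs far more at a sharp minimum than at a wide one.

```python
# Two toy 1-D training losses, both minimized at w = 0.
# The curvature values are hypothetical, chosen only for illustration.
def sharp(w):
    return 50.0 * w ** 2   # high curvature -> sharp minimum

def wide(w):
    return 0.5 * w ** 2    # low curvature -> wide minimum

# Model the train/test mismatch as a small perturbation of the weights.
eps = 0.1
print("sharp minimum, loss after shift:", sharp(eps))
print("wide  minimum, loss after shift:", wide(eps))
# The sharp minimum's loss grows 100x more under the same perturbation,
# which is the usual argument for why wide minima should generalize better.
```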