tomgoldstein / loss-landscape

Code for visualizing the loss landscape of neural nets
MIT License
2.72k stars 388 forks source link

Why L2 norm leads to narrower landscape near minimal when showing the trajectory? #38

Open Zhangyanbo opened 2 years ago

Zhangyanbo commented 2 years ago

In the figure 9 of your paper, I noticed that by using L2 norm, the landscape becomes more narrow around the minimal point. Which is different from previous figures.

I do know that you are using a different way of choosing vectors by PCA. And it can be understood by a way from-result-to-cause -- that is, L2 norm makes it harder to train, so the convex part is smaller. However, I curious if you have any deeper insight of this pattern? Thanks!