Why does the L2 norm lead to a narrower landscape near the minimum when showing the trajectory?

In Figure 9 of your paper, I noticed that with the L2 norm, the landscape becomes narrower around the minimum, which differs from the previous figures.

I know that you choose the direction vectors differently here, using PCA. The pattern can also be explained by reasoning from result to cause: the L2 norm makes training harder, so the convex region is smaller. However, I am curious whether you have any deeper insight into this pattern. Thanks!
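
For reference, here is a minimal sketch of what choosing the plotting directions by PCA could look like. It is an assumption about the general technique, not the repository's actual code: the function name `pca_directions` and the `checkpoints` array are hypothetical, and it uses an uncentered SVD of the displacements from the final weights, whereas a library PCA would typically mean-center first.

```python
import numpy as np


def pca_directions(checkpoints, n_components=2):
    """Sketch: PCA-style directions from a training trajectory.

    checkpoints: array of shape (T, D), one flattened weight vector per
    saved checkpoint, with the last row being the final model.
    (Hypothetical helper, not part of the loss-landscape repository.)
    """
    W = np.asarray(checkpoints, dtype=float)
    # Displacement of every earlier checkpoint from the final weights.
    diffs = W[:-1] - W[-1]
    # Uncentered SVD: rows of vt are orthonormal directions ordered by
    # how much of the trajectory's variation they capture.
    _, s, vt = np.linalg.svd(diffs, full_matrices=False)
    directions = vt[:n_components]
    # Fraction of total squared displacement captured by each direction.
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    # 2D coordinates of every checkpoint, with the final model at the origin.
    coords = (W - W[-1]) @ directions.T
    return directions, explained, coords
```

Because the axes are fitted to the trajectory itself, the plotted region and its scale depend on how far the weights moved during training, which may be relevant when comparing the width of the landscape across figures.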

tomgoldstein / loss-landscape, issue #38