charmeleonz closed this issue 1 month ago
Dear authors,
Could you kindly advise how I should go about reproducing the loss landscape visualisation (Fig. 4 in the paper)?
Many thanks!
Hi,
Sure! I don't have the exact code at hand, but if I recall correctly we did something like this:
import numpy as np
import torch

data, targets = next(iter(dataloader))  # one batch, (B x H x W x C)
x = torch.randn_like(data)  # first random input-space direction
y = torch.randn_like(data)  # second random input-space direction
losses = []
with torch.no_grad():
    for i in np.arange(-0.1, 0.1, 0.01):  # range() does not accept floats
        for j in np.arange(-0.1, 0.1, 0.01):
            new_input = data + i * x + j * y  # perturb the batch along x and y
            output = model(new_input)
            loss = loss_fn(output, targets)
            losses.append((i, j, loss.item()))
and then plotted the results with the loss on the z-axis of a surface plot (see, e.g., matplotlib's plot_surface). We did the same for both types of loss functions.
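For example, the plotting step could look roughly like this (a minimal sketch, assuming the losses list from the snippet above; the grid size follows from the -0.1 to 0.1 range with step 0.01, and the styling is not from our code):

import numpy as np
import matplotlib.pyplot as plt

i_vals, j_vals, loss_vals = zip(*losses)
n = int(round(len(losses) ** 0.5))  # 20 steps per axis for the ranges above
I = np.array(i_vals).reshape(n, n)
J = np.array(j_vals).reshape(n, n)
Z = np.array(loss_vals).reshape(n, n)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # 3-D axes, matplotlib >= 3.2
ax.plot_surface(I, J, Z, cmap="viridis")
ax.set_xlabel("i")
ax.set_ylabel("j")
ax.set_zlabel("loss")
plt.show()

Hope this helps.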
Hi,
Thanks for the hint. I am curious why the plots shown in the paper don't look like an inverted cone (as in, e.g., the TAKD or DOT papers)?
Hi, that is an interesting point! It seems TAKD uses [1] for visualisation. I would not have expected the loss landscape to be perfectly convex along two random input dimensions (unless very small step/perturbation sizes are used). We didn't put too much thought into the step size, but it may play a big role in the granularity/smoothness of the overall plot. For example, our plots might be showing lots of local minima/loss valleys. The main point of our plots was to compare and contrast the global variability of the two different loss functions.
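For reference, [1] perturbs the model weights rather than the input, along two random directions rescaled to match the weight norms ("filter normalisation"). A rough sketch of that idea, simplified to per-tensor rather than per-filter scaling (this is not the code behind our Fig. 4; model, data, targets and loss_fn are as in the snippet above):

import torch

def random_direction(model):
    # One random direction per parameter tensor, rescaled so each tensor
    # matches the norm of the corresponding weights (simplified from [1]).
    d = [torch.randn_like(p) for p in model.parameters()]
    for di, p in zip(d, model.parameters()):
        di.mul_(p.norm() / (di.norm() + 1e-10))
    return d

d1, d2 = random_direction(model), random_direction(model)
theta0 = [p.detach().clone() for p in model.parameters()]  # centre point

alpha, beta = 0.05, -0.02  # one grid point; loop over a grid in practice
with torch.no_grad():
    for p, t, u, v in zip(model.parameters(), theta0, d1, d2):
        p.copy_(t + alpha * u + beta * v)  # weights = theta0 + alpha*d1 + beta*d2
    loss = loss_fn(model(data), targets)  # landscape value at (alpha, beta)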
Finally, we also didn't visualise the results around a local minimum; instead, we just selected a random batch from the training data and built the grid around that point.
[1] Li et al., Visualizing the Loss Landscape of Neural Nets, NeurIPS 2018
That's really helpful. Cheers!