fastai / course22

The fast.ai course notebooks
https://course.fast.ai

Possible Error #58

Open aalbiol opened 1 year ago

aalbiol commented 1 year ago

In the notebook named "How does a neural net really work?", there is a point where the parameters of a parabola are found using gradients. The relevant cell contains:

    for i in range(10):
        loss = quad_mae(abc)
        loss.backward()
        with torch.no_grad(): abc -= abc.grad*0.01
        print(f'step={i}; loss={loss:.2f}')

If you run this loop for more than 10 iterations, the loss starts growing again.
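
For anyone reproducing this outside the notebook: the loop assumes that quad_mae and abc were defined in earlier cells. The following is only a minimal self-contained sketch of that setup (the exact notebook code, and the data x and y, may differ):

    from functools import partial
    import torch

    def quad(a, b, c, x): return a*x**2 + b*x + c              # a parabola
    def mk_quad(a, b, c): return partial(quad, a, b, c)
    def mae(preds, acts): return (preds - acts).abs().mean()    # mean absolute error

    # Hypothetical noisy data drawn from a known parabola (stand-in for the notebook's x, y)
    x = torch.linspace(-2, 2, 20)
    y = 3*x**2 + 2*x + 1 + torch.randn(20)*0.5

    # Loss of the parabola whose coefficients are given in params
    def quad_mae(params):
        f = mk_quad(*params)
        return mae(f(x), y)

    # Initial guess for the coefficients, tracked by autograd
    abc = torch.tensor([1.1, 1.1, 1.1], requires_grad=True)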

In the text, it's said that this is because the learning rate must be progressively decreased in practice. In my opinion, it is because every time loss.backward() is executed the gradients are "accumulated" rather than recomputed. If the gradients are reset to zero after each iteration, the loop converges to a minimum.

Proposed code:

    for i in range(10):
        loss = quad_mae(abc)
        loss.backward()
        with torch.no_grad():
            abc -= abc.grad*0.01
            abc.grad.fill_(0)  # New line: reset the accumulated gradients
        print(f'step={i}; loss={loss:.2f}')
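
As a side note (just a sketch, not something the notebook does), this reset is exactly what torch.optim performs when zero_grad() is called, so an equivalent way to write the corrected loop is:

    # Equivalent loop using a built-in optimizer; opt.zero_grad() plays the
    # role of resetting abc.grad between iterations.
    opt = torch.optim.SGD([abc], lr=0.01)
    for i in range(10):
        loss = quad_mae(abc)
        loss.backward()
        opt.step()        # abc -= lr * abc.grad, applied without tracking gradients
        opt.zero_grad()   # clear the accumulated gradients
        print(f'step={i}; loss={loss:.2f}')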

Let me conclude by congratulating you on this very clear explanation.

Regards