karpathy / micrograd

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
MIT License

Fixing tiny issue #63

Open bit-soham opened 8 months ago

bit-soham commented 8 months ago

Hello Sir, I noticed a small issue in the code while watching your videos, and I hope I have provided a good solution for it. Problem: if in a Google Colab notebook I have `Value` objects initialized and, without reinitializing them, call the `backward` function twice (e.g. from a different cell), the newly calculated gradients are accumulated on top of the previous gradients. So I am simply setting the grads to 0 while we are running `build_topo`, such that any node not visited before has its grad reset to 0.0.
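
For reference, a minimal sketch of the behaviour being described, assuming micrograd's `Value` from `micrograd.engine` (the numbers are just illustrative):

```python
from micrograd.engine import Value

a = Value(2.0)
b = a * 3.0          # simple graph: b = 3a, so db/da = 3

b.backward()
print(a.grad)        # 3.0

# calling backward() again without resetting grads accumulates,
# because each _backward step does `grad += ...` rather than overwrite:
b.backward()
print(a.grad)        # 6.0, not 3.0
```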

minh-nguyenhoang commented 8 months ago

That's rather a feature, not a bug. Most modern deep learning frameworks actually do this because, in a training loop, you may want to accumulate gradients over multiple minibatches before actually optimizing the parameters, since you cannot always compute all minibatches in a single step. If you want to reset grads, you may want to leave the leaf nodes' grads alone and only set the grads of non-leaf nodes to 0.
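
For illustration, a hedged sketch of that accumulation pattern using micrograd's `nn` module (the `MLP` sizes and minibatch data are hypothetical, made up for this example):

```python
from micrograd.nn import MLP

model = MLP(2, [4, 4, 1])   # 2 inputs, two hidden layers, scalar output

# hypothetical minibatches: list of (inputs, target) pairs
batches = [([1.0, 2.0], 1.0), ([-1.0, 0.5], -1.0)]

model.zero_grad()                 # reset parameter grads once, up front
for xs, y in batches:
    pred = model(xs)
    loss = (pred - y) ** 2        # squared error on one sample
    loss.backward()               # grads accumulate across minibatches

# one optimizer step using the summed gradients, then reset for the next step
for p in model.parameters():
    p.data -= 0.01 * p.grad
model.zero_grad()
```

This is exactly why zeroing grads is left to the user (e.g. via `model.zero_grad()`) rather than done automatically inside `backward()`.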

bit-soham commented 8 months ago

Thank you for your explanation. I had clearly misunderstood.

> That's rather a feature, not a bug. Most modern deep learning frameworks actually do this because, in a training loop, you may want to accumulate gradients over multiple minibatches before actually optimizing the parameters, since you cannot always compute all minibatches in a single step. If you want to reset grads, you may want to leave the leaf nodes' grads alone and only set the grads of non-leaf nodes to 0.