@karpathy
While watching your zero_to_hero series on YouTube (which I think is awesome),
I came up with a new idea for the backprop of C[Xb] (I have replied on YouTube as well).
The original method looks like this:
```python
dC = torch.zeros_like(C)
for k in range(Xb.shape[0]):
    for j in range(Xb.shape[1]):
        ix = Xb[k, j]
        dC[ix] += demb[k, j]
```
My method looks like this:
```python
dC = (F.one_hot(Xb).float().transpose(1, 2) @ demb).sum(0)
```
and I checked that the gradients match:
![image](https://github.com/karpathy/nn-zero-to-hero/assets/20755758/31b22aae-4f14-414a-8bf6-e5c43210a49d)
That works because indexing with Xb can be rewritten as a matrix multiplication with a one-hot encoding (C[Xb] == one_hot(Xb) @ C), so we can just apply the standard backprop rule for matrix multiplication.
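For anyone who wants to reproduce the check, here is a minimal self-contained sketch. The shapes, seed, and tensor contents are made up for illustration; `C`, `Xb`, and `demb` stand in for the embedding table, the index batch, and the upstream gradient of `emb = C[Xb]` from the lecture:

```python
import torch
import torch.nn.functional as F

# Hypothetical small shapes standing in for the makemore setup.
vocab_size, emb_dim = 27, 10
batch, block_size = 32, 3
g = torch.Generator().manual_seed(42)
C = torch.randn((vocab_size, emb_dim), generator=g)
Xb = torch.randint(0, vocab_size, (batch, block_size), generator=g)
demb = torch.randn((batch, block_size, emb_dim), generator=g)

# Loop method: scatter-add each position's gradient into its embedding row.
dC_loop = torch.zeros_like(C)
for k in range(Xb.shape[0]):
    for j in range(Xb.shape[1]):
        ix = Xb[k, j]
        dC_loop[ix] += demb[k, j]

# One-hot method: C[Xb] == one_hot(Xb) @ C, so by the matmul backprop
# rule dC = one_hot(Xb).T @ demb, summed over the batch dimension.
onehot = F.one_hot(Xb, num_classes=vocab_size).float()   # (batch, block, vocab)
dC_onehot = (onehot.transpose(1, 2) @ demb).sum(0)       # (vocab, emb_dim)

print(torch.allclose(dC_loop, dC_onehot))
```

Note that `num_classes=vocab_size` is passed explicitly here; without it, `F.one_hot` infers the size from `Xb.max() + 1`, which can come out smaller than the embedding table and make the shapes disagree with `C`.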