Closed haoma7 closed 5 years ago
In Section "Head gradients and the chain rule" z.backward(head_gradient) should be modified to y.backward(head_gradient)
In the section "Head gradients and the chain rule", z.backward(head_gradient) should be changed to y.backward(head_gradient).
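
For readers comparing the two calls, here is a minimal sketch in plain Python of what each would compute. It assumes a hypothetical scalar setup where y = 2 * x and z = y * x; these functions and the helper names backward_from_y and backward_from_z are illustrative only and are not taken from the tutorial's code. The head gradient is whatever value the caller passes in, and it simply multiplies the local derivative.

```python
# Hypothetical scalar example (assumption): y = 2 * x and z = y * x = 2 * x**2.
# The helpers below only mimic what a backward call with a head gradient
# would store in x.grad; they are not a real autograd API.

def backward_from_y(x, head_gradient):
    """Mimic y.backward(head_gradient): head_gradient * dy/dx, with dy/dx = 2."""
    return head_gradient * 2.0

def backward_from_z(x, head_gradient):
    """Mimic z.backward(head_gradient): head_gradient * dz/dx, with dz/dx = 4 * x."""
    return head_gradient * 4.0 * x

x = 3.0
head_gradient = 10.0  # supplied by the caller, e.g. a known dz/dy

grad_via_y = backward_from_y(x, head_gradient)  # 10 * 2     = 20.0
grad_via_z = backward_from_z(x, head_gradient)  # 10 * 4 * 3 = 120.0

print(grad_via_y, grad_via_z)
```

Under these assumptions the two calls give different results, so which one is correct depends on whether the head gradient is meant to stand for dz/dy (pair it with the backward call on y) or for a gradient flowing into z from further downstream (pair it with the backward call on z).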