udacity / sdc-issue-reports


term 1 lesson 4 section 20: Confusing notation #839

Closed: prekup closed this issue 6 years ago

prekup commented 7 years ago

In the Term 1, Lesson 4, Section 20 backpropagation algorithm there is some notation that caused me some confusion, especially coming from Section 19. At least it was so for me :)

You see, in the hidden layer weight update step you have \Delta w_{ij} = \Delta w_{ij} + \delta_j^h a_i, but shouldn't it be x_i, or something similar, since it is supposed to be the input to the hidden layer? Meanwhile, a_j is already being used for the output layer weight update ...
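For reference, here are the two update steps as I read them from the lesson (my own paraphrase with generic indices, so the exact symbols in the lesson may differ slightly):

```latex
% Output layer weight step, as I understand it:
\Delta W_{jk} = \Delta W_{jk} + \delta^o_k \, a_j
% Hidden layer weight step, the one I am asking about:
\Delta w_{ij} = \Delta w_{ij} + \delta^h_j \, a_i
```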

Kind regards,

Radu

mvirgo commented 6 years ago

Hi Radu,

For the first part - it's denoted a_i instead of x_i to keep the notation for the activated outputs of each layer consistent, even though the input doesn't have an activation function applied to it. a_i is the "output" from the input layer, a_j is the output from the hidden layer, and a_k is the output from the final layer. I've added a few additional notes to clarify this, since I find it much easier myself when each part of the notation is spelled out explicitly.

On the second part - a_j is the correct item to use. Note that a_j forms part of the input to the output unit (it's multiplied by the weights going into the output unit), which in this case means it is the output from the hidden unit. It is correctly multiplied by the output error term in this weight update step.
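Here's a minimal NumPy sketch of one update step using this notation. The shapes and variable names are my own illustrative choices, not the lesson's exact code: a_i (here just x) is the raw input, a_j is the sigmoid output of the hidden layer, and a_k is the final prediction.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Illustrative sizes: 3 inputs, 2 hidden units, 1 output unit
x = np.array([0.5, 0.1, -0.2])            # a_i: "output" of the input layer (no activation)
y = 0.6                                    # target

weights_input_hidden = np.array([[0.5, -0.6],
                                 [0.1, -0.2],
                                 [0.1,  0.7]])
weights_hidden_output = np.array([0.1, -0.3])

# Forward pass
hidden_input = np.dot(x, weights_input_hidden)
a_j = sigmoid(hidden_input)                # output of the hidden layer
output_input = np.dot(a_j, weights_hidden_output)
a_k = sigmoid(output_input)                # output of the final layer

# Backward pass
error = y - a_k
delta_o = error * a_k * (1 - a_k)                             # output error term
delta_h = weights_hidden_output * delta_o * a_j * (1 - a_j)   # hidden error terms

# The two weight update steps in question:
# output layer: Delta W_j  += delta_o  * a_j  -> the hidden layer's output feeds the output unit
# hidden layer: Delta w_ij += delta_h_j * a_i -> the raw inputs feed the hidden units
del_w_hidden_output = delta_o * a_j
del_w_input_hidden = x[:, None] * delta_h[None, :]
```

The key point is visible in the last two lines: each weight update multiplies the error term at a layer by whatever fed into that layer, and that "input" is always the previous layer's output, whether or not an activation was applied to it.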