udacity / dlnd-issue-reports


Suggested clarification of BPTT equation derivation #561

Closed pdudero closed 6 years ago

pdudero commented 6 years ago

Greetings. Another student and I had difficulty understanding the "Backpropagation Through Time" (BPTT) equations in the RNN lessons. The problem is that the derivation depends on a form of the chain rule that is not often used: the one for multiple dependent variables. There are a couple of places in your video content where you review the chain rule, but the review only involves a single independent variable. It would be very helpful to update the content, however you want to do it, to make explicit that the BPTT equation derivations are motivated entirely by the multi-dependent-variable form of the chain rule. Currently the content refers only obliquely to "contributions" and "accumulative gradients". Thanks.
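For reference, the form of the chain rule in question can be written as follows. The symbols `y`, `u_i`, `g_i`, and `n` are notation assumed here for illustration, not taken from the lesson: `y` depends on `W` only through several intermediate quantities, each of which depends on `W`.

```latex
% Assumed notation: y depends on W only through the intermediates u_1, ..., u_n.
\[
  y = f(u_1, \dots, u_n), \qquad u_i = g_i(W)
\]
% Multivariable chain rule: one term per path from W to y.
\[
  \frac{dy}{dW} = \sum_{i=1}^{n} \frac{\partial y}{\partial u_i}\,\frac{du_i}{dW}
\]
```

Each term of the sum is one "contribution", which is presumably what the lesson means by accumulating gradients.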

pdudero commented 6 years ago

FYI, some links to demonstrate the problem: https://discussions.udacity.com/t/difficulty-in-understanding-bptt/644784

https://classroom.udacity.com/nanodegrees/nd101/parts/6df11d93-4b44-4f35-b410-a63b2e852bb9/modules/96aa308a-b292-4506-b9f1-3e80821676ce/lessons/72ae4549-d6b9-4634-93c6-24964a41e598/concepts/ab9dd7dd-26fd-4ff6-8127-2665c9d4177d

https://classroom.udacity.com/nanodegrees/nd101/parts/6df11d93-4b44-4f35-b410-a63b2e852bb9/modules/96aa308a-b292-4506-b9f1-3e80821676ce/lessons/72ae4549-d6b9-4634-93c6-24964a41e598/concepts/e8aae31b-d77c-416c-82d7-82774154950e

And then on this page: https://classroom.udacity.com/nanodegrees/nd101/parts/6df11d93-4b44-4f35-b410-a63b2e852bb9/modules/96aa308a-b292-4506-b9f1-3e80821676ce/lessons/72ae4549-d6b9-4634-93c6-24964a41e598/concepts/593835fd-09a7-44a3-a1fc-56dadb02b2ca

...the quantity "W" is used before it is defined. Eq. 51 would seem to indicate that W is simply W1+W2+W3. But applying Eq. 52 to Eq. 51 then gives either that y is constant with respect to W, or else 1 = 3. So I can see no motivation at all for Eq. 51. There seems to be no connection between the mystical "accumulation of contributions" and the chain rule, from which it arises.
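One reading that avoids the contradiction, offered as an assumption rather than as what the lesson intends, is that W1, W2, and W3 are per-time-step copies of a single shared weight, and that it is the gradients that add up, not the weights. The sketch below checks this numerically on a made-up three-step scalar recurrence. The function names, the inputs, and the `tanh` activation are all hypothetical and are not the lesson's network.

```python
import math

def unrolled(w1, w2, w3, xs=(0.5, -0.3, 0.8), s0=0.2):
    """Toy three-step scalar RNN with a separate weight copy per time step."""
    s1 = math.tanh(w1 * s0 + xs[0])
    s2 = math.tanh(w2 * s1 + xs[1])
    s3 = math.tanh(w3 * s2 + xs[2])
    return s3  # treat the final state as the output y

def shared(w):
    """The same network with all three copies tied to one weight W."""
    return unrolled(w, w, w)

def central_diff(f, x, eps=1e-6):
    """Numerical derivative of a scalar function f at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

w = 0.7

# dy/dW when W is shared across all time steps.
d_shared = central_diff(shared, w)

# dy/dW_i for each copy, holding the other two copies fixed at w.
d_w1 = central_diff(lambda v: unrolled(v, w, w), w)
d_w2 = central_diff(lambda v: unrolled(w, v, w), w)
d_w3 = central_diff(lambda v: unrolled(w, w, v), w)
d_sum = d_w1 + d_w2 + d_w3

print(f"shared-weight gradient:      {d_shared:.8f}")
print(f"sum of per-copy gradients:   {d_sum:.8f}")
```

The two printed values should agree up to finite-difference error, which is the multivariable chain rule with dW_i/dW = 1 for each copy.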

Ortalarel commented 6 years ago

Thank you for the comment. We will review and update the content accordingly.