SkalskiP / ILearnDeepLearning.py

This repository contains small projects related to Neural Networks and Deep Learning in general. The subjects are closely linked with the articles I publish on Medium. I encourage you both to read the articles and to check how the code works in action.
https://medium.com/@skalskip
MIT License

Numpy deep neural network #29

Open marxav opened 4 years ago

marxav commented 4 years ago

Thank you for this wonderful example, which helped me understand the gradient descent implementation. I just noticed a minor mistake:

should be:

In addition:

should also be:

Otherwise, the code will not work, for instance if one wants to extend it to implement a regression use case instead of a classification use case (i.e. using "none" instead of "softmax" in the final layer and short-circuiting the final activation function in the code).
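(The code snippets referenced above did not survive here. The point under discussion is the 1/m averaging of gradients over the batch. A minimal sketch of a per-layer backward step with that scaling, assuming examples are stored column-wise and a ReLU activation; function and variable names are illustrative, not the repository's actual code:)

```python
import numpy as np

def single_layer_backward(dA_curr, W_curr, Z_curr, A_prev):
    """Backward step for one dense ReLU layer, averaging gradients over the batch."""
    m = A_prev.shape[1]                       # number of examples in the batch

    dZ_curr = np.array(dA_curr, copy=True)    # ReLU derivative: zero where Z <= 0
    dZ_curr[Z_curr <= 0] = 0

    dW_curr = np.dot(dZ_curr, A_prev.T) / m   # weight gradient, averaged over the batch
    db_curr = np.sum(dZ_curr, axis=1, keepdims=True) / m  # bias gradient, averaged
    dA_prev = np.dot(W_curr.T, dZ_curr)       # gradient passed to the previous layer

    return dA_prev, dW_curr, db_curr
```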

pranftw commented 3 years ago

Not necessarily, @marxav. If the derivative of the cost function with respect to the activation of the output layer already includes the 1/m factor, i.e. d(cost_fn)/d(activation) = (1/m) * ((1 - y)/(1 - a) - y/a), then there is no need to divide the parameter gradients by m again: the 1/m introduced there propagates through backpropagation to all the parameter gradients.
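In other words, the 1/m factor must appear exactly once along each gradient path, either in the cost derivative or in the per-layer dW/db computation, but not in both. A quick numerical check of that equivalence, assuming a sigmoid output layer with binary cross-entropy; all names here are illustrative:

```python
import numpy as np

np.random.seed(0)
m = 4                                       # batch size
a = np.random.uniform(0.1, 0.9, (1, m))     # sigmoid outputs of the final layer
y = np.array([[1.0, 0.0, 1.0, 0.0]])        # binary targets
A_prev = np.random.randn(3, m)              # activations of the previous layer

# Option A: put 1/m in the cost derivative, do NOT divide dW by m afterwards
dA = (1 / m) * ((1 - y) / (1 - a) - y / a)
dZ = dA * a * (1 - a)                       # sigmoid derivative
dW_a = np.dot(dZ, A_prev.T)

# Option B: plain cost derivative, divide dW by m in the layer backward step
dA = (1 - y) / (1 - a) - y / a
dZ = dA * a * (1 - a)
dW_b = np.dot(dZ, A_prev.T) / m

print(np.allclose(dW_a, dW_b))              # True: both placements give the same gradient
```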