davidrosenberg / mlcourse

Machine learning course materials.
https://davidrosenberg.github.io/ml2018

For risk function or empirical risk, show true gradient and minibatch gradient samples #27

Open davidrosenberg opened 7 years ago

davidrosenberg commented 7 years ago

On a contour plot, probably. Sampling far from the minimum should show most minibatch gradients pointing in roughly the right direction, while closer to the minimum they are less reliable. We could do this with varying minibatch sizes... This feels like it would be a good interactive demo...
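A minimal numerical sketch of the "far vs. near the minimum" claim, using a made-up least-squares empirical risk (the problem, sizes, and batch size are all illustrative assumptions, not anything fixed by the course materials). It measures how often a minibatch gradient lies within 90 degrees of the true gradient at each point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: empirical risk R(w) = (1/n) * sum_i (x_i . w - y_i)^2
n, d = 200, 2
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0])
y = X @ w_true + 0.1 * rng.normal(size=n)

def full_gradient(w):
    # Gradient of the empirical risk over all n points
    return 2.0 / n * X.T @ (X @ w - y)

def minibatch_gradient(w, batch_size):
    # Gradient estimated from a uniformly sampled minibatch
    idx = rng.choice(n, size=batch_size, replace=False)
    return 2.0 / batch_size * X[idx].T @ (X[idx] @ w - y[idx])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Far from the minimum vs. near it: fraction of minibatch gradients
# that have positive cosine with the full-batch gradient
for w0, label in [(np.array([10.0, 10.0]), "far"),
                  (w_true + 0.05, "near")]:
    g = full_gradient(w0)
    cosines = [cosine(minibatch_gradient(w0, 10), g) for _ in range(500)]
    frac = np.mean(np.array(cosines) > 0)
    print(f"{label}: fraction of minibatch grads in right half-space = {frac:.2f}")
```

Far from the minimum the fraction should be essentially 1, and it drops as the residuals become noise-dominated near the minimum; for the interactive demo, the same vectors could be drawn as arrows on the risk's contour plot.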

In the same vein, compare one full-batch gradient step (say, with line search) against a single epoch of stochastic or minibatch gradient descent, showing the paths for each. Could also show all the minibatch gradients evaluated at the initial point, and explain that the full-batch gradient is just those vectors added together, end to end, while an epoch of minibatch gradient descent allows the direction to be recalculated after each step. The latter seems like it should do much better, so long as the minibatches are taking you in roughly the right direction.
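The comparison above can be sketched numerically; this is an assumed setup (synthetic least-squares data, a fixed step size rather than line search, arbitrary batch size), meant only to illustrate one full-batch step vs. one epoch of minibatch steps over the same amount of data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic least-squares problem (illustrative; any smooth risk would do)
n, d = 200, 2
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0])
y = X @ w_true + 0.1 * rng.normal(size=n)

def risk(w):
    return np.mean((X @ w - y) ** 2)

def grad(w, idx):
    # Gradient of the risk restricted to the points in idx
    Xb, yb = X[idx], y[idx]
    return 2.0 / len(idx) * Xb.T @ (Xb @ w - yb)

w0 = np.array([10.0, 10.0])
lr = 0.05  # fixed step size, standing in for line search

# One full-batch gradient step (touches all n points once)
w_full = w0 - lr * grad(w0, np.arange(n))

# One epoch of minibatch SGD (also touches all n points once),
# recomputing the direction after every minibatch
batch_size = 10
w_sgd = w0.copy()
path = [w_sgd.copy()]  # the path to draw on the contour plot
for idx in np.split(rng.permutation(n), n // batch_size):
    w_sgd = w_sgd - lr * grad(w_sgd, idx)
    path.append(w_sgd.copy())

print(f"risk at start:             {risk(w0):.3f}")
print(f"after one full-batch step: {risk(w_full):.3f}")
print(f"after one SGD epoch:       {risk(w_sgd):.3f}")
```

For the same data budget, the epoch of minibatch steps gets much further, since each step corrects the direction; plotting `path` and the single full-batch segment on the contour plot would make that visual.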

davidrosenberg commented 7 years ago

@vakobzar where do we stand with this? Thanks...