HuwCampbell / grenade

Deep Learning in Haskell
BSD 2-Clause "Simplified" License

Topic/learning separation #11

Closed HuwCampbell closed 7 years ago

HuwCampbell commented 7 years ago

The goal here is to take learning out of backprop. Interestingly, runBackwards now returns a Gradient type, which is often () for layers with nothing to learn, like logit and tanh.

This means we don't have to shuffle around multiple sets of momentums; there can be just one, stored in the layer.

This is step one in getting it ready for parallel execution.
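For a rough picture of the split, here is a minimal sketch using scalar toy layers. Everything in it (SketchLayer, forwards, backwards, update, LearningParameters, Scale) is a hypothetical name made up for illustration, not the actual class or signatures in this PR. It only aims to show the shape of the idea: backprop is pure and hands back a gradient, and a separate update step consumes that gradient along with the single momentum kept in the layer.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical, simplified sketch: scalar layers only, not grenade's real API.

-- | Hyperparameters for the update step (field names are assumptions).
data LearningParameters = LearningParameters
  { learningRate :: Double
  , momentumRate :: Double
  }

class SketchLayer l where
  -- | What backprop hands to the update step; () when there is nothing to learn.
  type Gradient l
  -- | Forward pass on a single input value.
  forwards  :: l -> Double -> Double
  -- | Pure backprop: given the layer, its input and the derivative arriving
  --   from the next layer, return the layer's own gradient and the derivative
  --   to pass to the previous layer. The layer itself is not changed here.
  backwards :: l -> Double -> Double -> (Gradient l, Double)
  -- | Learning, kept separate from backprop.
  update    :: LearningParameters -> l -> Gradient l -> l

-- | A layer with no parameters: its gradient is ().
data Tanh = Tanh

instance SketchLayer Tanh where
  type Gradient Tanh = ()
  forwards _ x        = tanh x
  backwards _ x dOut  = ((), dOut * (1 - tanh x ^ (2 :: Int)))
  update _ layer ()   = layer

-- | A toy one-weight layer that carries its own single momentum.
data Scale = Scale
  { weight   :: Double
  , momentum :: Double
  } deriving (Show, Eq)

instance SketchLayer Scale where
  type Gradient Scale = Double
  forwards (Scale w _) x       = w * x
  backwards (Scale w _) x dOut = (dOut * x, dOut * w)
  -- Classic momentum update, chosen here purely for illustration.
  update (LearningParameters lr mr) (Scale w m) g =
    let m' = mr * m - lr * g
    in  Scale (w + m') m'
```

Because backwards in this sketch never touches the layer, gradients for several examples could in principle be computed independently and combined before a single update call, which is the kind of thing that would make parallel execution possible.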