Momentum shouldn't be stored in the layers any more. This will free us up to use a broader set of optimisation algorithms. We will, however, need to provide a class for fast updates and manipulations of learnable parameters.
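One way this could look, as a minimal sketch: a type class exposing a pure weight update, with the optimiser state (momentum, learning rate) passed in from outside the layer. The names `UpdateLayer`, `applyUpdate`, and `LearningParameters` are illustrative assumptions here, not a confirmed API.

```haskell
-- Hypothetical sketch: optimiser state lives outside the layer,
-- so the layer only needs to know how to apply a gradient to itself.
data LearningParameters = LearningParameters
  { learningRate     :: Double
  , learningMomentum :: Double
  } deriving Show

class UpdateLayer layer where
  -- | Apply a gradient (a layer holding gradient weights) to a layer.
  applyUpdate :: LearningParameters -> layer -> layer -> layer

-- A trivial bias layer as an example instance.
newtype Bias = Bias Double deriving (Show, Eq)

instance UpdateLayer Bias where
  applyUpdate p (Bias w) (Bias g) = Bias (w - learningRate p * g)

main :: IO ()
main = print (applyUpdate (LearningParameters 0.5 0.9) (Bias 1.0) (Bias 1.0))
-- prints: Bias 0.5
```

Because the update is a plain pure function, swapping SGD-with-momentum for Adam or RMSProp only changes what is threaded through, not the layers themselves.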
The `Gradient` associated type family shouldn't exist; we'll just return a `Network` holding gradient weights.
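A sketch of what this buys us, under assumed names (`Network`, `applyUpdates`, `UpdateLayer` are illustrative): if the gradient is a `Network` with the same type index as the weights, updating is a simple structural walk with no extra type-level machinery.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import Data.Kind (Type)

-- Hypothetical update class: rate -> weights -> gradient -> new weights.
class UpdateLayer l where
  applyUpdate :: Double -> l -> l -> l

-- A heterogeneous list of layers, indexed by the layer types.
data Network :: [Type] -> Type where
  NNil  :: Network '[]
  (:~>) :: UpdateLayer l => l -> Network ls -> Network (l ': ls)
infixr 5 :~>

-- The gradient is just another 'Network ls', so the two structures
-- can be zipped in lockstep, layer by layer.
applyUpdates :: Double -> Network ls -> Network ls -> Network ls
applyUpdates _ NNil       NNil       = NNil
applyUpdates r (w :~> ws) (g :~> gs) =
  applyUpdate r w g :~> applyUpdates r ws gs

newtype Bias = Bias Double deriving Show
instance UpdateLayer Bias where
  applyUpdate r (Bias w) (Bias g) = Bias (w - r * g)

main :: IO ()
main = case applyUpdates 0.5 (Bias 1.0 :~> NNil) (Bias 1.0 :~> NNil) of
         Bias w :~> NNil -> print w
-- prints: 0.5
```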
`randomNetwork` shouldn't exist. Networks in which every layer has a `Random` instance will themselves have a `Random` instance.
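This instance can be built inductively over the type-level list of layers, as the following sketch shows (the `Network` definition and `Bias` layer are illustrative assumptions): the empty network is trivially random, and a cons cell is a random head layer in front of a random tail.

```haskell
{-# LANGUAGE DataKinds, FlexibleInstances, GADTs, KindSignatures, TypeOperators #-}
import Data.Kind (Type)
import System.Random

data Network :: [Type] -> Type where
  NNil  :: Network '[]
  (:~>) :: l -> Network ls -> Network (l ': ls)
infixr 5 :~>

-- Base case: the empty network is trivially random.
instance Random (Network '[]) where
  random g    = (NNil, g)
  randomR _ g = (NNil, g)

-- Inductive case: cons a random layer onto a random tail.
instance (Random l, Random (Network ls)) => Random (Network (l ': ls)) where
  random g = let (l,  g')  = random g
                 (ls, g'') = random g'
             in (l :~> ls, g'')
  randomR _ = random

-- An example layer with its own Random instance.
newtype Bias = Bias Double deriving Show
instance Random Bias where
  random g  = let (x, g') = randomR (-1 :: Double, 1) g in (Bias x, g')
  randomR _ = random

size :: Network ls -> Int
size NNil      = 0
size (_ :~> t) = 1 + size t

main :: IO ()
main = do
  let (net, _) = random (mkStdGen 42) :: (Network '[Bias, Bias, Bias], StdGen)
  print (size net)
-- prints: 3
```

With these instances in place, `random` at a concrete network type replaces a bespoke `randomNetwork` function entirely.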