Also, the number of epochs required for training a network has decreased. The term epoch was formerly, and mistakenly, used to mean the number of times the network had been trained on one set of `inputs` and the matching `labels`. This concept was incorrect. An epoch is a single pass through the (possibly batched) dataset.
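To make the distinction concrete, here is a small illustration (the numbers are hypothetical, not from Annex):

```elixir
# Hypothetical dataset: 1_000 samples split into mini-batches of 100.
samples = 1_000
batch_size = 100

iterations_per_epoch = div(samples, batch_size)
# => 10
# One epoch is one full pass over all 10 batches of the dataset,
# not 1_000 separate trainings on single inputs/labels pairs.
```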
This PR adds the `Annex.Optimizer` behaviour. So far, the `Annex.Optimizer` behaviour has only one function: `train/3`. The `train/3` of `Annex.Optimizer` has exactly the same function specification as the `Annex.Learner` `train/3` callback.
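For readers unfamiliar with Elixir behaviours, a minimal sketch of the shape this implies is below. The argument and return types shown (`learner`, `data`, `options`, and the `{:ok, _} | {:error, _}` result) are assumptions for illustration, not the actual spec from this PR:

```elixir
defmodule Annex.Optimizer do
  @moduledoc """
  Behaviour for optimizers capable of training an Annex learner.
  """

  # Illustrative callback spec only: the real types mirror the
  # Annex.Learner.train/3 callback, which is not reproduced here.
  @callback train(learner :: struct(), data :: any(), options :: keyword()) ::
              {:ok, struct()} | {:error, any()}
end
```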
It may be a good idea to create an `Annex.Learner.Trainer` behaviour. This would DRY up the `train/3` specification that currently lives in two places but must, in fact, be the same specification.

Additionally, this PR adds `Annex.Optimizer.SGD`, an `Annex.Optimizer` implementation for running mini-batch (or entirely unbatched) stochastic gradient descent.
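As a rough sketch of what mini-batch (or unbatched) SGD looks like as a training loop, under the epoch definition above; the module, function names, and `:batch_size` option here are hypothetical and are not the actual `Annex.Optimizer.SGD` code:

```elixir
defmodule SGDSketch do
  # Runs one epoch: a single pass over the (possibly batched) dataset.
  # `step_fun` stands in for the real gradient-descent update step.
  def train_epoch(learner, dataset, opts, step_fun) do
    dataset
    |> batch(Keyword.get(opts, :batch_size))
    |> Enum.reduce(learner, fn batch, acc -> step_fun.(acc, batch) end)
  end

  # With no :batch_size the entire dataset is one "batch" (plain,
  # unbatched gradient descent); otherwise chunk into mini-batches.
  defp batch(dataset, nil), do: [dataset]
  defp batch(dataset, size), do: Enum.chunk_every(dataset, size)
end
```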