Closed danijar closed 8 years ago
Do a single matrix multiplication for the training batch rather than looping over examples and performing vector multiplications. This also allows the Gradient interface to accept either single examples or batches.
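A minimal sketch of the idea, using NumPy; the names `weights` and `batch` are illustrative and not from the project:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 3))  # input_dim x output_dim
batch = rng.standard_normal((8, 4))    # batch_size x input_dim

# Old approach: loop over examples, one vector-matrix product each.
looped = np.stack([example @ weights for example in batch])

# New approach: a single matrix multiplication for the whole batch.
batched = batch @ weights

assert np.allclose(looped, batched)

# The same expression also works for a single example (a 1-D vector),
# which is why one interface can handle both cases.
single = batch[0] @ weights
assert np.allclose(single, batched[0])
```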
Gradient
No longer needed, since activation derivatives now receive the upstream derivative as a parameter.