Closed: milancurcic closed this 1 year ago
This now works. There's an API change to the `network % train()` and `network % update()` methods, which now require an argument of `class(optimizer_base_type)`. (I wonder if it's possible to make this argument optional so we can default to `sgd`.)
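Roughly, the optional-argument handling could look like this (a sketch only; the `sgd` constructor arguments and the local variable handling are assumptions, not the final implementation):

```fortran
subroutine update(self, optimizer)
  class(network), intent(inout) :: self
  class(optimizer_base_type), intent(in), optional :: optimizer
  class(optimizer_base_type), allocatable :: local_optimizer

  if (present(optimizer)) then
    local_optimizer = optimizer
  else
    ! Hypothetical default; exact constructor and learning rate TBD
    local_optimizer = sgd(learning_rate=0.01)
  end if

  ! ... pass local_optimizer down to each layer % update() ...
end subroutine update
```

The polymorphic allocatable assignment requires Fortran 2008 semantics, which the library already relies on elsewhere.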
Once an optimizer is passed to `network % update`, it's passed to `layer % update()` for all layers. In `layer % update`, the weights and biases are accessed from the internal layer representation and passed to `optimizer % minimize()`. I borrowed the name `minimize` from Keras. `optimizer % optimize()` would be appropriate but sounds weird due to the repetition. How about `optimizer % update()`?
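The call chain described above could be sketched like this (names such as `self % p`, `dense_layer`, `dw`, and `db` are placeholders for the internal layer representation, not the actual members):

```fortran
subroutine layer_update(self, optimizer)
  class(layer), intent(inout) :: self
  class(optimizer_base_type), intent(in) :: optimizer

  ! Dispatch on the concrete layer type to reach its parameters
  select type (this_layer => self % p)
    type is (dense_layer)
      ! Weights and biases from the internal representation are handed
      ! to the optimizer, which applies the gradient step in place
      call optimizer % minimize(this_layer % weights, this_layer % dw)
      call optimizer % minimize(this_layer % biases, this_layer % db)
  end select
end subroutine layer_update
```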
The optimizer step for `conv2d` is currently not implemented, but it may be easy to do even in this PR (though convolutional training is broken anyway, as explained in #142).
Thanks for bringing up the API change to the network methods. Making the optimizer argument optional, with SGD as the default, sounds like a good idea.
Regarding the naming, I think `optimizer % minimize()` is good, as it captures the essence of the operation. I'll also study all the updates in the code.
First attempt at defining the concrete optimizer procedure as a method of the SGD optimizer type. I'm currently defining the `minimize` subroutine as `elemental` to allow a scalar/array/rank-agnostic interface. This may not work for all cases if we discover new requirements, but let's try it for the time being.
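A minimal sketch of what the elemental SGD step could look like (the exact signature and the `learning_rate` component name are assumptions):

```fortran
! Elemental (and therefore pure) so the same call works for scalar
! parameters and for arrays of any rank, applied elementwise
elemental subroutine minimize(self, param, gradient)
  class(sgd), intent(in) :: self
  real, intent(inout) :: param
  real, intent(in) :: gradient
  ! Plain gradient descent step
  param = param - self % learning_rate * gradient
end subroutine minimize
```

With this, `call optimizer % minimize(weights, dw)` updates a whole weight array in one call, which is what makes the rank-agnostic interface attractive.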