bayerj closed this issue 11 years ago
By stop functionality, do you mean the stop variable that skips some yields?
We won't be able to remove everything that isn't needed from the optimizer loop. For the stopping conditions (issue #5), everything that could cause a stop must be in the info dictionary. That includes the loss. So we would have to calculate the loss in the optimizer even when the optimizer itself doesn't need it (e.g. GradientDescent).
But otherwise I also thought that stop seems unnecessary. The same functionality as stop=n can be achieved with a soft stop condition that returns info['n_iter'] % n == 0.
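A rough sketch of such a condition, just to make the idea concrete. The name modulo_n_iter is made up for illustration; the only thing taken from the discussion is that info is a dictionary with an 'n_iter' entry.

```python
def modulo_n_iter(n):
    """Return a criterion that is true on every n-th iteration.

    Hypothetical helper; assumes info is a dict with an 'n_iter' key.
    """
    def inner(info):
        return info['n_iter'] % n == 0
    return inner


# Usage: check the criterion against a few fake info dictionaries.
every_third = modulo_n_iter(3)
hits = [i for i in range(1, 10) if every_third({'n_iter': i})]
```

Note that the criterion is also true for n_iter == 0, so whether the first report happens before or after the first update depends on where the optimizer starts counting.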
Calculating the loss because it might be needed by a stopping criterion does not seem reasonable to me -- calculation of the loss might take several minutes for some models, like big RNNs.
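IMHO, we would then have to move the loss calculation into the stopping criterion, e.g.:

    def stop_if_loss_smaller_than_2(lossfunc):
        # lossfunc takes no arguments and returns the current loss.
        def inner(info):
            # The loss is only evaluated when the criterion is checked.
            return lossfunc() < 2
        return inner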
This is of course stupid if several stopping criteria need the loss, since each of them would recompute it. Difficult.
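One possible way around the repeated evaluation, sketched under assumptions: wrap the loss function so that it runs at most once per iteration and let all criteria share the wrapper. cached_per_iteration and loss_smaller_than are hypothetical names, and the sketch assumes info['n_iter'] changes on every iteration. This is not something that was decided in this thread.

```python
def cached_per_iteration(lossfunc):
    """Wrap lossfunc so it is evaluated at most once per info['n_iter'].

    Hypothetical helper; assumes info is a dict with an 'n_iter' key.
    """
    cache = {}

    def loss(info):
        key = info['n_iter']
        if cache.get('key') != key:
            # First request in this iteration: do the expensive evaluation.
            cache['key'] = key
            cache['value'] = lossfunc()
        return cache['value']
    return loss


def loss_smaller_than(loss, threshold):
    """Criterion that is true once the (cached) loss drops below threshold."""
    def inner(info):
        return loss(info) < threshold
    return inner


# Usage: two criteria share one loss evaluation within the same iteration.
calls = []

def expensive_loss():
    calls.append(1)  # record each real evaluation
    return 1.5

loss = cached_per_iteration(expensive_loss)
criteria = [loss_smaller_than(loss, 2), loss_smaller_than(loss, 1)]
info = {'n_iter': 7}
results = [criterion(info) for criterion in criteria]
```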
Actually, this is an argument for the generator idea for stopping criteria, because then every criterion could update info. But this could be implemented in another way as well.
Ha! Loss is mostly calculated on a validation set to check for convergence. So again, doing it internally does not help.
Hmm, tricky problem. If info was an object, we could lazily evaluate the loss when it's needed. If the optimizer requires it, it calculates it within __iter__ and stores it in info. If the optimizer doesn't need it but the stop condition requires it, it would be calculated outside the loop and then stored in info. And if nobody needs it, it wouldn't be calculated.
This would make the info object a little complex, though. It would need to know the loss function, for example. On the plus side, the user wouldn't know about it because it's hidden away. The user would still only request info.loss and not know about the internals.
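A minimal sketch of what such an object might look like, assuming the loss function takes no arguments. The class name Info and its constructor are invented for illustration; only the info.loss access comes from the proposal above.

```python
class Info:
    """Hypothetical info object whose loss is computed on first access."""

    def __init__(self, lossfunc, **fields):
        self._lossfunc = lossfunc
        self._loss = None
        self._has_loss = False
        # Plain fields such as n_iter become ordinary attributes.
        for name, value in fields.items():
            setattr(self, name, value)

    @property
    def loss(self):
        # Evaluate the loss only if nobody has stored or requested it yet.
        if not self._has_loss:
            self._loss = self._lossfunc()
            self._has_loss = True
        return self._loss

    @loss.setter
    def loss(self, value):
        # An optimizer that computes the loss anyway can store it here.
        self._loss = value
        self._has_loss = True


# Usage: the loss function runs once, and only when info.loss is requested.
calls = []

def lossfunc():
    calls.append(1)  # record each real evaluation
    return 0.25

info = Info(lossfunc, n_iter=3)
calls_before_access = len(calls)
first = info.loss
second = info.loss
```

The setter covers the case where the optimizer needs the loss inside its loop; a stop condition reading info.loss afterwards then gets the stored value without a second evaluation.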
I removed stop. The other problems will be resolved somewhere else.
I just realized that if we never calculate more than needed in the optimizer's loop (because everything else can be done from the outside), we actually don't need the stop functionality. Yields are rather fast compared to model evaluations. This would make the code a lot simpler.
Any objections?