BRML / climin

Optimizers for machine learning

Remove stop #9

Closed bayerj closed 11 years ago

bayerj commented 12 years ago

I just realized that if we never calculate more than needed in the optimizer's loop (because anything else can be done from the outside), we don't actually need the stop functionality. Yields are rather fast compared to model evaluations. This would make the code a lot simpler.

Any objections?

rueckstiess commented 12 years ago

By stop functionality, do you mean the stop variable that skips some yields?

We won't be able to remove everything not needed in the optimizer loop. For the stopping conditions (issue #5), everything that could cause a stop must be in the info dictionary. That includes the loss. So we would have to calculate the loss in the optimizer, even if it isn't needed (e.g. GradientDescent).

But otherwise I also thought that stop seems unnecessary. The same functionality as stop=n can be achieved with a soft stop condition that returns info['n_iter'] % n == 0.
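A minimal sketch of that soft condition, assuming the optimizer yields an info dictionary with an 'n_iter' key on every iteration. The names every_n and fake_optimizer are hypothetical and only serve the illustration; they are not part of climin.

```python
def every_n(n):
    """Return a condition that is true on every n-th iteration."""
    def inner(info):
        return info['n_iter'] % n == 0
    return inner


def fake_optimizer(max_iter):
    # Hypothetical stand-in for an optimizer: yields one info dict per step.
    for i in range(max_iter):
        yield {'n_iter': i}


report = every_n(3)

# The caller decides from the outside which iterations to act on;
# the optimizer itself yields every time and needs no stop argument.
reported = [info['n_iter'] for info in fake_optimizer(10) if report(info)]
```

Since the filtering happens in the caller, the optimizer loop stays free of any skipping logic.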

bayerj commented 12 years ago

Calculating the loss because it might be needed by a stopping criterion does not seem reasonable to me -- calculation of the loss might take several minutes for some models, like big RNNs.

IMHO, we would then have to include the loss calculation in the stopping criterion, e.g.:

def stop_if_loss_smaller_than_2(lossfunc):
    def inner(info):
        return lossfunc() < 2
    return inner

This is of course stupid if several stopping criteria need the loss, since each one would compute it again. Difficult.
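One possible way to soften this, sketched under assumptions: wrap the loss function so that it is evaluated at most once per iteration and let all criteria share the wrapper. CachedLoss and stop_if_loss_smaller_than are hypothetical names, and the sketch assumes info carries an 'n_iter' key.

```python
class CachedLoss(object):
    """Evaluate lossfunc at most once per iteration (hypothetical helper)."""

    def __init__(self, lossfunc):
        self.lossfunc = lossfunc
        self._n_iter = None
        self._value = None

    def __call__(self, info):
        # Recompute only when the iteration counter has changed.
        if info['n_iter'] != self._n_iter:
            self._value = self.lossfunc()
            self._n_iter = info['n_iter']
        return self._value


def stop_if_loss_smaller_than(cached_loss, threshold):
    def inner(info):
        return cached_loss(info) < threshold
    return inner


calls = []


def expensive_loss():
    # Stand-in for a costly evaluation; records each call.
    calls.append(1)
    return 1.5


loss = CachedLoss(expensive_loss)
criteria = [stop_if_loss_smaller_than(loss, 2),
            stop_if_loss_smaller_than(loss, 1)]

# Both criteria are checked in the same iteration; the loss is computed once.
results = [criterion({'n_iter': 0}) for criterion in criteria]
```

This only avoids duplicate work within one iteration; it does not address which data set the loss should be computed on.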

bayerj commented 12 years ago

Actually, this speaks for the generator idea of stopping criteria, because then every criterion could update info. But this could be implemented in another way as well.

bayerj commented 12 years ago

Ha! Loss is mostly calculated on a validation set to check for convergence. So again, doing it internally does not help.

rueckstiess commented 12 years ago

Hmm, tricky problem. If info were an object, we could lazily evaluate the loss when it's needed. If the optimizer requires the loss, it calculates it within __iter__ and stores it in info. If the optimizer doesn't need it but the stop condition does, it would be calculated outside the loop and then stored in info. And if nobody needs it, it wouldn't be calculated.

This would make the info object a little complex, though. It would need to know the loss function, for example. On the plus side, the user wouldn't notice, because it's hidden away: the user would still only request info.loss and not know about the internals.
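A rough sketch of such a lazy info object, assuming it is handed a zero-argument loss function. The class name Info and its attributes are hypothetical; nothing here reflects how climin actually implements info.

```python
class Info(object):
    """Per-iteration info whose loss is computed only on first access."""

    def __init__(self, n_iter, lossfunc):
        self.n_iter = n_iter
        self._lossfunc = lossfunc
        self._loss = None
        self._has_loss = False

    @property
    def loss(self):
        # Lazy path: evaluate the loss the first time anyone asks for it.
        if not self._has_loss:
            self._loss = self._lossfunc()
            self._has_loss = True
        return self._loss

    @loss.setter
    def loss(self, value):
        # Eager path: an optimizer that computes the loss anyway stores it.
        self._loss = value
        self._has_loss = True


evaluations = []


def lossfunc():
    # Stand-in for an expensive loss; records each evaluation.
    evaluations.append(1)
    return 0.25


info = Info(n_iter=0, lossfunc=lossfunc)
before_access = len(evaluations)   # nothing has been computed yet
first = info.loss                  # triggers the single evaluation
second = info.loss                 # served from the stored value

# An optimizer that already knows the loss can store it directly.
eager = Info(n_iter=1, lossfunc=lossfunc)
eager.loss = 0.1
```

A property keeps the access syntax at info.loss, so a stopping criterion does not need to know whether the value was computed by the optimizer or on demand.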

bayerj commented 12 years ago

I removed stop. The other problems will be resolved somewhere else.