I did a comparative study of the adaptive step size vs the 'auto' step size, comparing them both in number of iterations and in wall-clock time. These are the results with logistic loss (results are very similar for other loss functions) and alpha = 1/n (the setting in the SAG paper by Schmidt) for three different datasets.
My takeaway: there is no big improvement from the adaptive step size strategy. There is a slight gain in number of iterations (which is reassuring), but this slight advantage is lost in terms of time because of the overhead of the line-search criterion.
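For context, a minimal sketch of the kind of comparison run here (not the linked notebook; the parameter name `eta` and the `'line-search'` option follow this discussion and should be checked against the merged API):

```python
import time

from sklearn.datasets import make_classification
from lightning.classification import SAGAClassifier

X, y = make_classification(n_samples=5000, n_features=100, random_state=0)
alpha = 1.0 / X.shape[0]  # alpha = 1/n, the setting from the SAG paper

# Hypothetical sketch: fit with each step-size strategy and time it.
for eta in ('auto', 'line-search'):
    clf = SAGAClassifier(eta=eta, loss='log', alpha=alpha, max_iter=20)
    t0 = time.time()
    clf.fit(X, y)
    print('%s: %.2f s' % (eta, time.time() - t0))
```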
Being able to fall back on this line search is very important for my use of lightning, since we work with (exponential-like) loss functions whose gradient is not globally Lipschitz; in that case we need such a criterion to estimate a "local" Lipschitz constant.
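For readers unfamiliar with the criterion, here is a minimal sketch of the backtracking strategy described in the SAG paper: double the Lipschitz estimate until the standard descent inequality holds (names are illustrative, not lightning's internals):

```python
import numpy as np

def local_lipschitz(f, grad, x, L=1.0, max_doublings=50):
    # Double L until  f(x - g / L) <= f(x) - ||g||^2 / (2 L),
    # the standard descent inequality, giving a "local" Lipschitz
    # estimate even when no global constant exists.
    g = grad(x)
    fx = f(x)
    g_sq = np.dot(g, g)
    for _ in range(max_doublings):
        if f(x - g / L) <= fx - 0.5 * g_sq / L:
            break
        L *= 2.0
    return L

# Example with an exponential-like loss, whose curvature (hence the
# Lipschitz constant of the gradient) grows without bound:
f = lambda x: np.sum(np.exp(x))
grad = lambda x: np.exp(x)
print(local_lipschitz(f, grad, np.array([0.0, 3.0])))
```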
Ideally, I would like to merge this but keep stepsize='auto' as the default. What do you think @mblondel ?
@mblondel : I also fixed the step size in get_auto_step_size; you were right, there was no n_samples in the formula :-( (it just made the step size super small)
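For reference, a hedged sketch of how this kind of formula is derived for logistic loss (illustrative only, not lightning's exact code; it assumes the objective (1/n) Σᵢ loss(w; xᵢ, yᵢ) + (α/2)‖w‖²):

```python
import numpy as np

def auto_step_size_log(X, alpha):
    # The logistic loss has curvature at most ||x_i||^2 / 4, so each
    # summand's gradient is Lipschitz with constant
    #     L_i = ||x_i||^2 / 4 + alpha,
    # and the step size is 1 / max_i L_i.  Whether alpha must also be
    # scaled by n_samples depends on how the regularizer is folded into
    # the finite sum -- exactly the kind of factor that was off here.
    max_squared_sum = np.max(np.sum(X ** 2, axis=1))
    return 4.0 / (max_squared_sum + 4.0 * alpha)
```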
Code to replicate the figure: https://github.com/fabianp/saga_report/blob/master/code/adaptive_step_size.ipynb
I think I would prefer stepsize='line-search'. WDYT? Other than that, looks good to me :)
Yes, in the Schmidt paper it is also referred to as "line-search". Let's go with that then.
I changed the name
@mblondel please merge if it's good for you :-)
> we work with (exponential-like) loss functions whose gradient is not globally Lipschitz
Please add them to lightning :) (in a future PR of course)
Merged!
Great, thanks for the review!