scikit-learn-contrib / lightning

Large-scale linear classification, regression and ranking in Python
https://contrib.scikit-learn.org/lightning/
1.73k stars 214 forks source link

[MRG] Adaptive step size strategy for SAG/SAGA #66

Closed fabianp closed 8 years ago

fabianp commented 8 years ago

TODO

fabianp commented 8 years ago

I did a comparative study of the adaptive step size vs the 'auto' stepsize. I compared them both in terms in number of iterations and time. These are the results with logistic loss (results are very similar for other loss functions) and alpha = 1/n (this is the setting in the SAG paper by Schmidt) for three different datasets.

image

image

image

My take away: there is no big improvement in the adaptive step size strategy. There is a slight gain in terms of number of iterations (which is reasurring) but this slight advantage is lost in terms of time because of the overhead of the line-search criterion.

Having the possibility to rely on this line-search is very important for my usage of lightning since we are working with loss functions (exponential-like) that don't have a bounded Lipschitz constant, in which case we need to rely on such criterion to estimate the "local" Lipschitz constant.

Ideally, I would like to merge this but leaving stepsize='auto' as default. What do you think @mblondel ?

@mblondel : I fixed also the step size in get_auto_step_size, you were right, there was no n_samples in the formula :-( (it just makes the stepsize super small)

fabianp commented 8 years ago

Code to replicate the figure: https://github.com/fabianp/saga_report/blob/master/code/adaptive_step_size.ipynb

mblondel commented 8 years ago

I think I would prefer stepsize='line-search'. WDYT? Other than that, looks good to me :)

fabianp commented 8 years ago

Yes, in the Schmidt paper it is also refered to as "line-search". Let's go with that then.

fabianp commented 8 years ago

I changed the name

fabianp commented 8 years ago

@mblondel please merge if its good for you :-)

mblondel commented 8 years ago

since we are working with loss functions (exponential-like) that don't have a bounded Lipschitz constant

Please add them to lightning :) (in a future PR of course)

mblondel commented 8 years ago

Merged!

fabianp commented 8 years ago

Great, thanks for the review! On Apr 25, 2016 5:08 PM, "Mathieu Blondel" notifications@github.com wrote:

Merged!

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/scikit-learn-contrib/lightning/pull/66#issuecomment-214384054