scikit-learn-contrib / lightning

Large-scale linear classification, regression and ranking in Python
https://contrib.scikit-learn.org/lightning/

Support for Sparse Group Lasso? #104

Open AlejandroCatalina opened 7 years ago

AlejandroCatalina commented 7 years ago

Hi,

I've just found this library and it seems pretty good, congratulations! I see that you support group lasso, and I was wondering whether there is a plan to support sparse group lasso, since that shouldn't be much trouble (with all due respect). I haven't dug into the code much, so it may already be there, but at first sight I didn't see it.

I actually saw @fabianp's implementation of sparse group lasso in a personal repo, from which you can get the standard group lasso solution just by setting the parameter alpha = 0, and I thought it might be useful to have a fast, solid implementation here.

Either way, thank you for this great work!

fabianp commented 7 years ago

Hey, I don't have plans to implement it in the near future, but patches are welcome :-)

AlejandroCatalina commented 7 years ago

That's great, I'll take a deeper look at the code and see if I can give it a go.

Thank you for the fast answer.


AlejandroCatalina commented 7 years ago

Could you point me to some documentation or an algorithm description to help me follow the code? I've been looking into it (specifically the solve_l1l2 function) and I don't really get it. I was expecting something more like https://github.com/fabianp/group_lasso/blob/master/group_lasso.py, but that doesn't seem to be the case. I think it's harder to follow because it's a generic solver for any objective function, isn't it?

fabianp commented 7 years ago

Yes, in lightning there are multiple generic solvers. For sparse group lasso, you probably want to use coordinate descent or FISTA. So implementing the sparse group lasso amounts to defining the appropriate prox or update rule for these methods.
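To make "the appropriate prox" concrete, here is a minimal NumPy sketch of the proximal operator for a sparse group lasso penalty. It is not lightning's code. The function name, the `groups` array, and the `alpha_l1` / `alpha_group` parameters are hypothetical, and the penalty is assumed to be the unweighted form `alpha_l1 * ||w||_1 + alpha_group * sum_g ||w_g||_2` with non-overlapping groups. For this penalty the prox can be computed by soft-thresholding each coefficient and then shrinking each group as a block.

```python
import numpy as np


def prox_sparse_group_lasso(w, groups, alpha_l1, alpha_group, step=1.0):
    """Prox of step * (alpha_l1 * ||w||_1 + alpha_group * sum_g ||w_g||_2).

    `groups` is a hypothetical array of length n_features giving the group
    label of each coefficient (non-overlapping groups are assumed).
    """
    w = np.asarray(w, dtype=float)
    groups = np.asarray(groups)

    # Step 1: elementwise soft-thresholding (prox of the L1 term).
    out = np.sign(w) * np.maximum(np.abs(w) - step * alpha_l1, 0.0)

    # Step 2: block soft-thresholding of each group (prox of the group term).
    for label in np.unique(groups):
        mask = groups == label
        norm = np.linalg.norm(out[mask])
        if norm > 0.0:
            out[mask] *= max(1.0 - step * alpha_group / norm, 0.0)
    return out
```

With `alpha_l1 = 0` this reduces to the plain group lasso prox, and with `alpha_group = 0` it reduces to the ordinary lasso soft-thresholding.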

AlejandroCatalina commented 7 years ago

Okay, thanks. Is there any difference in performance between FISTA and coordinate descent? In the implementation, I mean: I see that coordinate descent is implemented in Cython while FISTA is only in Python.

I'd say both are pretty competitive, but I'm interested in large applications. Will FISTA hold up against the current coordinate descent implementation?

fabianp commented 7 years ago

Simplifying a lot, I would say that they both have similar performance, but this will depend on things like sparsity, number of features, and strong convexity.

AlejandroCatalina commented 7 years ago

Okay, great. I've implemented FISTA but I would like to give coordinate descent a shot. I'll open a PR when I get home.

Thank you for your time.
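For readers following along, the FISTA iteration under discussion can be sketched in a few lines of NumPy. This is a textbook version for a least-squares loss with a pluggable prox, not the implementation in lightning or the one mentioned in this comment. The function names and the `prox(v, step)` calling convention are assumptions made for the illustration.

```python
import numpy as np


def fista(X, y, prox, n_iter=200):
    """Minimize 0.5 * ||X w - y||^2 + penalty(w) with FISTA.

    `prox(v, step)` is assumed to return the proximal operator of
    step * penalty evaluated at v.
    """
    # Constant step size 1/L, where L is the largest eigenvalue of X^T X.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(X.shape[1])
    z = w.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)              # gradient of the smooth part at z
        w_next = prox(z - step * grad, step)  # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)  # momentum update
        w, t = w_next, t_next
    return w


def soft_threshold(v, step, alpha=0.1):
    """Prox of step * alpha * ||v||_1 (plain lasso), used here as a stand-in."""
    return np.sign(v) * np.maximum(np.abs(v) - step * alpha, 0.0)
```

Calling `fista(X, y, soft_threshold)` solves a lasso problem. Swapping in a sparse group lasso prox changes the penalty without touching the solver, which is the sense in which the solver is generic.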


AlejandroCatalina commented 7 years ago

I have a generic question: how do you specify the groups to the classifier? I've been playing with this and I've implemented the Sparse Group Lasso, but now that I am running some examples I don't know how to indicate which groups there are. Any pointers to docs or anything related?

fabianp commented 7 years ago

For now we don't, as the existing group lasso implementation assumes that the groups are the coefficients associated with the different classes (it depends on a multiclass formulation). I would add a parameter groups=[] to the class, of size n_features, where each entry specifies the group to which the coefficient belongs.
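As an illustration of the proposed `groups` parameter, the sketch below shows how an array of length n_features with one group label per coefficient could be turned into index sets and used to evaluate a group penalty. This parameter does not exist in lightning at the time of this thread. The helper names are hypothetical and the penalty is assumed to be the unweighted sum of group L2 norms.

```python
import numpy as np


def group_indices(groups):
    """Map each group label to the indices of the features it contains."""
    groups = np.asarray(groups)
    return {label: np.flatnonzero(groups == label) for label in np.unique(groups)}


def group_lasso_penalty(coef, groups):
    """Sum of the L2 norms of the coefficient blocks defined by `groups`."""
    coef = np.asarray(coef, dtype=float)
    return sum(np.linalg.norm(coef[idx]) for idx in group_indices(groups).values())


# Five features split into three groups: {0, 1}, {2}, {3, 4}.
groups = [0, 0, 1, 2, 2]
coef = [3.0, 4.0, 0.0, 0.0, -2.0]
penalty = group_lasso_penalty(coef, groups)  # 5.0 + 0.0 + 2.0
```

Storing one label per feature keeps the parameter the same length as the coefficient vector, at the cost of not being able to express overlapping groups.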

AlejandroCatalina commented 7 years ago

Yes, that's what I thought. That would make the class useful for regression, wouldn't it? Right now there isn't a proper group lasso for regression, if I haven't misunderstood you.


fabianp commented 7 years ago

You are right, there isn't.

AlejandroCatalina commented 7 years ago

Is there a plan to include it? I am playing with it, but I'm afraid I may implement it in a suboptimal way, which would hurt performance in large-scale applications.

Sorry for the persistence.

fabianp commented 7 years ago

I have no plans to work on that in the short term.