yngvem / group-lasso

Group Lasso implementation following the scikit-learn API
MIT License
105 stars 32 forks source link

Improve sklearn compatibility #16

Closed normanius closed 3 years ago

normanius commented 4 years ago

Hi Ynge

Again me... Probably it's already an issue on your backlog: The estimators do not fully comply with the sklearn interface: Rolling your own estimator. I also found this template useful.

I had a look at this today and was able to pass sklearn's compatibility checks with a couple of edits. I added a corresponding test-case in test/test_group_lasso.py.

Unfortunately, while sklearn now is happy, I broke some of the tests that have y as a multi-label with y.shape[1]>1. I wasn't able to fix this right away (it got late here)... I'm confident that you're more efficient in fixing these issues anyways. Take my commits as suggestions.

HTH

Update Note: For sklearn compatibility checks to pass, I had to set l1_reg=0 and group_reg=0, because LogisticGroupLasso has quite some convergence problems, which sklearn does not tolerate. (One might set could set the poor_score tag, but that's not what we want.)


To check sklearn compatibility:

from sklearn.utils.estimator_checks import parametrize_with_checks, check_estimator
from group_lasso import GroupLasso, LogisticGroupLasso

# The actual checks.
check_estimator(GroupLasso)
check_estimator(LogisticGroupLasso)

# As test case.
@parametrize_with_checks([GroupLasso, LogisticGroupLasso])
def test_sklearn_compatible_estimator(estimator, check):
    check(estimator)

Rough overview:

pep8speaks commented 4 years ago

Hello @normanius! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 407:55: E261 at least two spaces before inline comment Line 877:9: E265 block comment should start with '# '

Line 20:26: E124 closing bracket does not match visual indentation

yngvem commented 3 years ago

The latest version should pass the sklearn compatibility checks, so hopefully this is fixed.