Improve sklearn compatibility

normanius commented 4 years ago

Hi Ynge

Again me... Probably it's already an issue on your backlog: The estimators do not fully comply with the sklearn interface: Rolling your own estimator. I also found this template useful.

I had a look at this today and was able to pass sklearn's compatibility checks with a couple of edits. I added a corresponding test-case in test/test_group_lasso.py.

Unfortunately, while sklearn now is happy, I broke some of the tests that have y as a multi-label with y.shape[1]>1. I wasn't able to fix this right away (it got late here)... I'm confident that you're more efficient in fixing these issues anyways. Take my commits as suggestions.

HTH

Update Note: For sklearn compatibility checks to pass, I had to set l1_reg=0 and group_reg=0, because LogisticGroupLasso has quite some convergence problems, which sklearn does not tolerate. (One might set could set the poor_score tag, but that's not what we want.)

To check sklearn compatibility:

from sklearn.utils.estimator_checks import parametrize_with_checks, check_estimator
from group_lasso import GroupLasso, LogisticGroupLasso

# The actual checks.
check_estimator(GroupLasso)
check_estimator(LogisticGroupLasso)

# As test case.
@parametrize_with_checks([GroupLasso, LogisticGroupLasso])
def test_sklearn_compatible_estimator(estimator, check):
    check(estimator)

Rough overview:

Constructor argument "groups" requires default argument. I suggest None, which is interpreted as "each feature forms its own group".
fit() should return self
Added some missing input checks
sklearn by default squeezes prediction output to 1d arrays (unless multi-label output is enforced, not supported yet)
Issue a sklearn ConvergencWarning instead of a RuntimeWarning if FISTA fails to converge
LogisticGroupLasso: lacks member classes_
LogisticGroupLasso: decode result - my fix likely isn't very safe
Fix several minor things, typos, matrix shape issues

pep8speaks commented 4 years ago

Hello @normanius! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file src/group_lasso/_group_lasso.py:

Line 407:55: E261 at least two spaces before inline comment Line 877:9: E265 block comment should start with '# '

In the file test/test_group_lasso.py:

Line 20:26: E124 closing bracket does not match visual indentation

yngvem commented 3 years ago

The latest version should pass the sklearn compatibility checks, so hopefully this is fixed.

yngvem / group-lasso

Improve sklearn compatibility #16