glm-tools / pyglmnet

Python implementation of elastic-net regularized generalized linear models
http://glm-tools.github.io/pyglmnet/
MIT License

[MRG] callbacks and modified convergence criteria #252

Closed pavanramkumar closed 6 years ago

pavanramkumar commented 6 years ago

Replaces #233 and closes #232.

pavanramkumar commented 6 years ago

test_glmcv() takes a very long time to pass, and I think it's because of the repeated fits in each fold for each reg_lambda.

I'm not yet convinced that we're getting the GLMCV() use case right with respect to what happens in R's glmnet, i.e. just fitting the model along a regularization path with no cross-validation. I want to study the scikit-learn LassoCV use case more closely before we merge this.
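
To make the cost concrete, here is a minimal sketch, assuming a GLMCV-style search refits one model per (reg_lambda, fold) pair; the data, lambdas, and fit call below are illustrative, not pyglmnet's actual internals:

```python
import numpy as np
from sklearn.model_selection import KFold

# Illustrative sketch (not pyglmnet's code): a GLMCV-style search pays
# for one full optimization per (reg_lambda, fold) pair.
X, y = np.random.randn(100, 5), np.random.randn(100)
reg_lambdas = np.logspace(-1, -4, 10)

n_fits = 0
for reg_lambda in reg_lambdas:
    for train_idx, test_idx in KFold(n_splits=5).split(X):
        # a real search would fit on X[train_idx] and score on X[test_idx]
        n_fits += 1

print(n_fits)  # 10 reg_lambdas * 5 folds = 50 full fits
```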

codecov-io commented 6 years ago

Codecov Report

Merging #252 into master will decrease coverage by 0.6%. The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #252      +/-   ##
==========================================
- Coverage   73.58%   72.97%   -0.61%     
==========================================
  Files           4        4              
  Lines         617      618       +1     
  Branches      120      125       +5     
==========================================
- Hits          454      451       -3     
- Misses        123      128       +5     
+ Partials       40       39       -1
Impacted Files | Coverage Δ
--- | ---
pyglmnet/pyglmnet.py | 78.68% <100%> (-0.78%) ↓


pavanramkumar commented 6 years ago

OK, I reviewed LassoCV again, and it does seem equivalent to GLMCV in what it does, which is in turn equivalent to R's cv.glmnet.

However, we don't have a simple class that fits along a regularization path without selecting the best reg_lambda by cross-validation. Something to think about for the future.

LGTM
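
For the record, a minimal sketch of what such a path-only class could do: one fit per reg_lambda, warm-started from the previous solution. `fit_one_lambda` is a hypothetical solver, not part of the pyglmnet API:

```python
import numpy as np

def fit_path(X, y, reg_lambdas, fit_one_lambda):
    """Hypothetical regularization-path fit: no cross-validation,
    one coefficient vector per reg_lambda, each warm-started from
    the previous solution (fit_one_lambda is a placeholder solver)."""
    betas = []
    beta = np.zeros(X.shape[1])
    for reg_lambda in np.sort(reg_lambdas)[::-1]:  # strongest penalty first
        beta = fit_one_lambda(X, y, reg_lambda, beta_init=beta)
        betas.append(beta)
    return betas
```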

jasmainak commented 6 years ago

@pavanramkumar:

However, we don't have a simple class that fits along a regularization path without selecting the best reg_lambda by cross-validation. Something to think about for the future.

I would argue that you can pass cv=1, and that would do it for you, no?

I removed callback from the public API (that is, no documentation) for now, because we use it only for testing and there is no compelling use case for users. Is that okay with you?

Also, I think we should document the stopping criteria. +1 for MRG once this is done!

pavanramkumar commented 6 years ago

@jasmainak looks good, it's fine to remove callback from the public API. For documenting convergence, I'd just add a comment line, something like:

# convergence achieved if learning_rate * norm(gradient) < tolerance

Can you add this?
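
For reference, a toy sketch of that stopping rule on a simple quadratic loss; the learning rate, tolerance, and gradient here are illustrative, not pyglmnet's:

```python
import numpy as np

beta = np.array([5.0, -3.0])
learning_rate, tol, max_iter = 0.1, 1e-6, 10000

for t in range(max_iter):
    grad = 2 * beta                                 # gradient of ||beta||^2
    if learning_rate * np.linalg.norm(grad) < tol:  # the criterion above
        break                                       # convergence achieved
    beta = beta - learning_rate * grad
```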

jasmainak commented 6 years ago

convergence achieved if learning_rate * norm(gradient) < tolerance

I also noticed a norm(beta) in the denominator ... can you clarify why that is needed?

jasmainak commented 6 years ago

@pavanramkumar just noticed a couple more things. Let me know. I can fix them for you if needed.

jasmainak commented 6 years ago

Wouldn't the evaluation of the second condition be short-circuited in the if clause?

On Sat 8 Sep 2018 at 16:16, Pavan Ramkumar notifications@github.com wrote:

@pavanramkumar commented on this pull request.

In pyglmnet/pyglmnet.py https://github.com/glm-tools/pyglmnet/pull/252#discussion_r216139834:

```diff
     for t in range(0, self.max_iter):
         if self.solver == 'batch-gradient':
             grad = _grad_L2loss(self.distr, alpha, self.Tau,
                                 reg_lambda, X, y, self.eta, beta)
+            if t > 1:
+                if np.linalg.norm(grad) / np.linalg.norm(beta) < tol / lr:
```

Hmm, we get computational savings by not having to compute gradient and parameter norms in the first iteration.

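For reference, a small self-contained demonstration of the short-circuiting point: with the two checks joined by `and`, Python never evaluates the norm ratio while `t <= 1`, so the savings are the same as with the nested ifs (all names below are illustrative):

```python
import numpy as np

def norm_ratio(grad, beta):
    print("norms computed")              # visible side effect for the demo
    return np.linalg.norm(grad) / np.linalg.norm(beta)

grad, beta, tol, lr = np.ones(3), np.ones(3), 1e-5, 1e-3

for t in range(3):
    # for t <= 1 the left operand is False, so norm_ratio is never called
    if t > 1 and norm_ratio(grad, beta) < tol / lr:
        print("converged at", t)

# "norms computed" is printed only once, at t == 2
```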

pavanramkumar commented 6 years ago

I also noticed a norm(beta) in the denominator ... can you clarify why that is needed?

Think of it as a relative tolerance, where you decide convergence based on the relative magnitude of the update term; relative because it's normalized by the parameter norm. (With an absolute tolerance, you would simply decide convergence based on the magnitude of the update term.)
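
A minimal sketch of the distinction, with illustrative function names and arguments rather than pyglmnet's internals:

```python
import numpy as np

def absolute_converged(grad, lr, tol):
    # absolute tolerance: stop when the update term itself is small
    return lr * np.linalg.norm(grad) < tol

def relative_converged(grad, beta, lr, tol):
    # relative tolerance: stop when the update term is small relative to
    # the parameter norm; this rearranges to norm(grad) / norm(beta) < tol / lr,
    # which is the form that appears in the diff above
    return lr * np.linalg.norm(grad) / np.linalg.norm(beta) < tol
```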

pavanramkumar commented 6 years ago

@jasmainak made all the changes we discussed. Merge?

jasmainak commented 6 years ago

@pavanramkumar you now have a merge commit ... you can do an interactive rebase to remove it. Let me know if I can push; then I can do it for you.

jasmainak commented 6 years ago

Otherwise LGTM

pavanramkumar commented 6 years ago

@jasmainak yes, the merge commit didn't catch all the conflicts, so the tests failed. You can go ahead and rebase it!

jasmainak commented 6 years ago

Okay, rebased. Let's hope the tests pass now.

pavanramkumar commented 6 years ago

Fixed conflicts. Merging.

jasmainak commented 6 years ago

@pavanramkumar looks like it slowed down one of the examples by quite a bit, so CircleCI is now failing ...