glm-tools / pyglmnet

Python implementation of elastic-net regularized generalized linear models

http://glm-tools.github.io/pyglmnet/

MIT License

279 stars 83 forks source link

approximate gradients / hessians for probit to prevent over/underflow #250

Closed pavanramkumar closed 6 years ago

pavanramkumar commented 6 years ago

closes #243

jasmainak commented 6 years ago

@pavanramkumar Travis is not happy.

are you sure it's an underflow/overflow problem? Because the gradient checks are fine on master.

pavanramkumar commented 6 years ago

@jasmainak i think it depends in what range we're checking the gradients.

are you able to plot the loss on master? i'm deducing over/underflow because the loss is running into nans

my hypothesis is:

the approximate gradients i'm calculating are not good enough to pass the gradient checks.
the approximate gradients also don't guarantee that the loss decreases steadily.
but they are good enough because the accuracy test for beta passes.

let me explicitly check the betas.

codecov-io commented 6 years ago

Codecov Report

Merging #250 into master will increase coverage by 0.99%. The diff coverage is 95.65%.

@@            Coverage Diff             @@
##           master     #250      +/-   ##
==========================================
+ Coverage   58.44%   59.44%   +0.99%     
==========================================
  Files           7        7              
  Lines        1302     1339      +37     
  Branches      261      262       +1     
==========================================
+ Hits          761      796      +35     
  Misses        473      473              
- Partials       68       70       +2

Impacted Files	Coverage Δ
pyglmnet/pyglmnet.py	`79.46% <95.65%> (+1.24%)`	:arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 65bb9f7...615c6b9. Read the comment docs.

pavanramkumar commented 6 years ago

@jasmainak made some more changes for the gradient check. approximate logL to prevent under/overflow did the trick.

See Eq. (17-20) here: https://pdfs.semanticscholar.org/0c03/0537919f09575b9f2c0a98c62f6571bdceee.pdf

I also did some better nan checks and reintroduced the "loss always decreases" test for probit/cdfast which I had commented out in the previous commit.

Tests pass!

jasmainak commented 6 years ago

@pavanramkumar I'm happy if Travis is happy :) I didn't look closely at the equations but I'm confident it's correct.

jasmainak commented 6 years ago

might have been worth it to put a link to the paper in the comments, as otherwise someone looking at the code in the future may be lost ...