Closed. jasmainak closed this pull request 7 years ago.
Merging #217 into master will decrease coverage by 0.03%. The diff coverage is 98.24%.
```diff
@@            Coverage Diff             @@
##           master     #217      +/-   ##
==========================================
- Coverage   77.91%   77.87%   -0.04%
==========================================
  Files           4        4
  Lines         498      470      -28
  Branches       92       88       -4
==========================================
- Hits          388      366      -22
+ Misses         77       72       -5
+ Partials       33       32       -1
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| pyglmnet/utils.py | 32.55% <ø> (-12.06%) | :arrow_down: |
| pyglmnet/metrics.py | 100% <100%> (ø) | :arrow_up: |
| pyglmnet/pyglmnet.py | 81.34% <98.03%> (-0.52%) | :arrow_down: |
| doc/_themes/sphinx_rtd_theme/__init__.py | 100% <0%> (ø) | |
Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data.

Powered by Codecov. Last update e7ff601...c3fa627.
@jasmainak the `_logL()` within pyglmnet.py is called by `_loss()`, which is only used to track the loss and determine convergence. The `_logL()` within utils.py is used for scoring the model.

In the former case, it makes sense to think of the log likelihood as a function of the parameters (`beta0`, `beta`) since that is what you are optimizing with respect to. In the latter case, we want to score the prediction against the test data. Here, you want to separate the scoring from the estimation as much as possible. There are also numerical issues, such as adding an `eps` parameter to make sure we don't get `-Inf` or `NaN`.

Hence the difference in API of the two functions. Let me know your suggestions to DRY them.
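As a concrete illustration of the numerical issue mentioned above, here is a minimal sketch of an `eps` guard for a Poisson log likelihood used for scoring; the function name and exact form are illustrative rather than pyglmnet's actual code:

```python
import numpy as np

def _poisson_logL_for_scoring(y, mu, eps=np.finfo(float).eps):
    """Poisson log likelihood of observations y given predicted means mu,
    dropping the log(y!) constant since it does not depend on the fit.

    Clipping mu away from zero keeps np.log from returning -inf (and the
    sum from turning into nan) when a predicted mean underflows to 0.
    """
    mu = np.maximum(mu, eps)
    return np.sum(y * np.log(mu) - mu)
```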
Hmm ... I was thinking of abstracting away the likelihood as something that is a property of a distribution. The loss function that we have of course takes the argument `beta` as that is how we parametrize it.

I'm also thinking in terms of making it trivial for future contributors to add a new distribution. It should be as simple as updating the log likelihood and its derivative. The link function and the rest should all be taken care of automatically because of our API design.

Worst case, we can add a parameter (`mode='scoring'`, for example) that chooses the code path depending on whether we are scoring or computing the loss.
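To make that concrete, here is a hedged sketch of what a per-distribution likelihood object could look like; the class and method names are hypothetical, not pyglmnet's actual API:

```python
import numpy as np

class PoissonDistr:
    """Hypothetical distribution object: a new distribution would only
    need to define the log likelihood and its derivative with respect
    to the linear predictor z = beta0 + X @ beta; the link function and
    the rest of the fitting machinery would be shared."""

    def mu(self, z):
        # inverse of the canonical log link
        return np.exp(z)

    def logL(self, y, z):
        # Poisson log likelihood up to the log(y!) constant
        return np.sum(y * z - self.mu(z))

    def grad_logL(self, y, z):
        # d logL / dz; the chain rule through z then gives the
        # gradients with respect to beta0 and beta
        return y - self.mu(z)
```

Under this kind of design, the fitting code and the scoring code could both call `logL`, with the `eps` guard applied only on the scoring path.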
Ready for merge from my end!
ping @pavanramkumar. Did you get a chance to look at the diff?
@jasmainak I really like what you did with the chain rule. Barring a few small issues I pointed out, it LGTM! Thanks for looking into this carefully!
@pavanramkumar let me know about the log likelihood. The other comment is addressed.
@pavanramkumar I removed `log_likelihood` for now. If users request it later, it's cheap to add it back. The other comment is addressed.
I'm OK with `log_likelihood`, but check out my comment regarding the underflow prevention for the binomial log likelihood (binomial cross entropy). It's a special case of the well-known log-sum-exp trick, see e.g. https://hips.seas.harvard.edu/blog/2013/01/09/computing-log-sum-exp/ (hint: interview question!)
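For reference, a minimal sketch of the trick being pointed to: a numerically stable softplus is the two-term case of log-sum-exp, and the Bernoulli log likelihood can be written in terms of it (the function names here are illustrative, not pyglmnet's actual code):

```python
import numpy as np

def _softplus(z):
    """Numerically stable log(1 + exp(z)): the two-term case of the
    log-sum-exp trick, factoring out max(z, 0) so the remaining
    exponential never overflows."""
    return np.maximum(z, 0) + np.log1p(np.exp(-np.abs(z)))

def binomial_logL(y, z):
    """Bernoulli log likelihood written in terms of the logits z,
    avoiding overflow in exp(z) for large |z|."""
    return np.sum(y * z - _softplus(z))
```

NumPy's `np.logaddexp(0, z)` computes the same stabilized quantity elementwise.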
I will merge this PR now!
This is built on top of https://github.com/glm-tools/pyglmnet/pull/216.
@pavanramkumar there are some differences between the two functions that I don't understand. Perhaps you could explain them to me over chat.