Closed Genarito closed 11 months ago
Definitely we maximize the ll. The code snippet is likely wrong then. I think I was thinking "score" should be minimized, but that's not true.
In general, SciPy and other libraries with optimization algorithms focus on finding the minimum. In maximum likelihood, we instead maximize the likelihood (or, equivalently, the log-likelihood). To implement MLE with those libraries, we minimize the negative log-likelihood, which is equivalent to maximizing the log-likelihood. @CamDavidsonPilon would be able to confirm this, but I suspect the negative log-likelihood is being minimized behind the scenes.
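To make the "minimize the negative log-likelihood" point concrete, here is a minimal sketch using a simple exponential model rather than anything from lifelines. The durations are made up, censoring is ignored, and the names `log_likelihood` and `neg_log_likelihood` are just illustrative; this is not how lifelines fits its models internally.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical, fully observed durations (no censoring), for illustration only.
durations = np.array([2.0, 3.5, 1.2, 4.8, 2.5])


def log_likelihood(rate):
    """Exponential log-likelihood: n * log(rate) - rate * sum(t)."""
    return len(durations) * np.log(rate) - rate * durations.sum()


def neg_log_likelihood(rate):
    """What a minimizer sees: the log-likelihood flipped upside down."""
    return -log_likelihood(rate)


# SciPy minimizes, so we hand it the negative log-likelihood.
result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0), method="bounded")
rate_hat = result.x

# For this model the MLE is known in closed form: n / sum(t).
closed_form = len(durations) / durations.sum()
print(rate_hat, closed_form)
```

The minimizer of the negative log-likelihood lands on the same rate that maximizes the log-likelihood, which is why the two descriptions are interchangeable.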
The score function is the derivative of the log-likelihood. To find the maximum log-likelihood, we find the place where the score is zero (the slope at a maximum is zero). Here, you could use a root-finding algorithm instead to solve for that point (so we have two ways to find the point estimates).
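As a rough illustration of the root-finding route, the sketch below solves "score = 0" for an exponential model with made-up, uncensored durations. `score_function` here is the statistical score (the derivative of the log-likelihood), a name chosen for this example; it is not the `score` method on lifelines models.

```python
from scipy.optimize import brentq

# Hypothetical, fully observed durations (no censoring), for illustration only.
durations = [2.0, 3.5, 1.2, 4.8, 2.5]
n, total = len(durations), sum(durations)


def score_function(rate):
    """Derivative of the exponential log-likelihood n*log(rate) - rate*total."""
    return n / rate - total


# The score is positive for small rates and negative for large ones,
# so a bracketing root finder can locate the point where it crosses zero.
rate_hat = brentq(score_function, 1e-6, 10.0)

# Closed-form MLE for comparison: n / sum(t).
closed_form = n / total
print(rate_hat, closed_form)
```

For this toy model the root of the score and the closed-form estimate agree, which is the "two ways to find the point estimates" idea in miniature.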
> but I suspect the negative log-likelihood is being minimized behind the scenes.
yea, min the neg ll or max the ll are equivalent. Internally we min the neg ll, but expose the ll to users via the `score` function (btw @pzivich, the `score` function on lifelines models mimics the `score` of scikit-learn models, and isn't the same as the derivative of the log-likelihood - confusing I know!)
Sorry, but I am a bit lost. In the example code I posted, the values returned by `score` are both negative. If that function is returning the log-likelihood, then how can it be distinguished from the negative log-likelihood?
I mean, a log-likelihood is probably going to be negative: it's the log of values between 0 and 1. When we discuss a neg log-likelihood vs log-likelihood, we are really talking about the shape of the log-likelihood surface (bowl shaped vs hill shaped).
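A small sketch of that point, again with a toy exponential model and made-up held-out durations rather than lifelines itself: both candidate rates give a negative average log-likelihood, and the better one is simply the higher value (the one closer to zero). `average_log_likelihood` is a hypothetical helper written for this example.

```python
import math

# Hypothetical held-out durations (no censoring), for illustration only.
test_durations = [2.0, 3.5, 1.2, 4.8, 2.5]


def average_log_likelihood(rate, durations):
    """Mean exponential log-density, log(rate) - rate * t, over the durations."""
    return sum(math.log(rate) - rate * t for t in durations) / len(durations)


# Two candidate "models", i.e. two fixed rates to compare on the same data.
score_a = average_log_likelihood(0.35, test_durations)
score_b = average_log_likelihood(1.50, test_durations)

# Both values are negative here; the higher (less negative) one fits better.
better = "a" if score_a > score_b else "b"
print(score_a, score_b, better)
```

Seeing a negative number therefore does not mean it is a negative log-likelihood; the sign convention only decides whether you look for the highest or the lowest value.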
I understand, thank you both! So the only change needed in the documentation's code snippet is to move the comment `# better model` to the `cph_l2` model line. If you want, I can open a PR so as not to add work for you.
Regards
Feel free to send a PR!
Hi, first of all, thank you very much for this library! :100:
I'm reading the Log Likelihood documentation and it says:
But I see that, in the example below, there is a comment `# better model` that corresponds to the lowest (not highest) value:
Is this an error in the code example or should the log-likelihood be minimized instead of maximized?
I really appreciate any help you can provide