muschellij2 closed this pull request 4 years ago.
Hi, thanks a lot! I don't have a lot of time to check the additions now, but I'll get back to you. Best, Christian.
Hi, I've now checked the additions.
I think adding the Jaccard index is a good idea, so I'm going to merge that and write a test. I'm also going to make the function lower case, because the other metric functions are in lower case, too.
Regarding the log loss, did the supplied function work for you? cutpointr should have thrown an error, because a metric function has to return a matrix with multiple rows for vector inputs. For example, log_loss(1:2, 4:5, 3:4, 7:8) should return something like

     log_loss
[1,] 379.9265
[2,] 449.0041

I changed res = cbind(-1 * sum(res)) to res = cbind(-1 * res) in the function to make it work.
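For context, the quoted values are consistent with the usual eps-clipped log loss computed directly from the confusion-matrix counts. Here is a sketch of that arithmetic in Python (not the package's R code; the eps = 1e-15 clip and the (tp, fp, tn, fn) argument order are assumptions matching common conventions):

```python
import numpy as np

def log_loss_from_counts(tp, fp, tn, fn, eps=1e-15):
    """Log loss for hard 0/1 predictions, with probabilities
    clipped to [eps, 1 - eps].

    Each correct classification contributes -log(1 - eps) (about 0)
    and each misclassification contributes -log(eps), so the total is
    roughly (fp + fn) * -log(eps). Vector inputs give one value per
    cutpoint, mirroring the one-row-per-cutpoint matrix that a
    cutpointr metric function is expected to return.
    """
    tp, fp, tn, fn = (np.asarray(x, dtype=float) for x in (tp, fp, tn, fn))
    return -((tp + tn) * np.log(1 - eps) + (fp + fn) * np.log(eps))

print(log_loss_from_counts([1, 2], [4, 5], [3, 4], [7, 8]))
# values close to the [379.9265, 449.0041] matrix quoted above
```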
Anyway, I'm still trying to wrap my head around the meaning of that metric for a binary predictor, such as in this case. For a small eps the formula can be simplified to -1 * (fp + fn) * [some negative number]. The plot of cutpoint values vs. metric values (via plot_metric) has the same shape as the one from misclassification_cost with equal costs, and the latter is easier to interpret. misclassification_cost of course also finds the same optimal cutpoints at the minimal sum of misclassifications.
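That equivalence can be illustrated with made-up counts (a Python sketch; the cutpoints and fp/fn counts below are invented for illustration): since the eps-clipped log loss is a positive multiple of fp + fn, it ranks cutpoints exactly like the equal-cost misclassification count.

```python
import math

# Hypothetical (fp, fn) counts at a few candidate cutpoints
candidates = {0.2: (9, 1), 0.4: (5, 2), 0.6: (2, 6), 0.8: (1, 11)}

def misclassifications(fp, fn):
    return fp + fn                      # equal-cost misclassification cost

def approx_log_loss(fp, fn, eps=1e-15):
    return (fp + fn) * -math.log(eps)   # positive multiple of fp + fn

best_cost = min(candidates, key=lambda c: misclassifications(*candidates[c]))
best_ll = min(candidates, key=lambda c: approx_log_loss(*candidates[c]))
assert best_cost == best_ll  # both pick the same optimal cutpoint
print(best_cost)
```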
Was there a certain application for the log loss?
These seem like great enhancements! Many Kaggle competitions use log loss as the metric: https://www.kaggle.com/dansbecker/what-is-log-loss
Dice/Jaccard is what I use for most imaging applications.
Best, John
Sure, on Kaggle we often develop a model for predicting probabilities and these are then judged by log loss. cutpointr just dichotomizes the probabilities, as you've noted, so we could just as well use misclassification_cost, which by default equals the number of misclassifications, because it can be interpreted more easily than the log loss and the "loss function" has the same shape.
Did you actually use this in a Kaggle competition or somewhere else? Did it lead to an improvement? I'm just not aware of log-loss for purely binary predictions. Usually, we'd optimize the underlying model directly for log-loss.
I haven't used it in Kaggle directly. I've been using it for other smaller projects, but I think the addition of Jaccard/Dice and the proposed addition of the misclassification loss you've described is sufficient.
Best, John
Oh, maybe this was a misunderstanding - misclassification_cost is already in the package, so I was wondering why you were using log loss instead.
This is a bit tricky. I'm mainly worried that log loss with binary predictions may confuse some users; on the other hand, it is apparently not obvious that misclassification_cost with equal costs will find the same cutpoint.
I'm either going to add log loss anyway or I'm going to add a remark about it in the help for misclassification_cost.
I added the Jaccard index (https://en.wikipedia.org/wiki/Jaccard_index), which can be calculated from F1, but I figured it'd be good to have.
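For reference, the relation to F1 follows from the count definitions: J = tp / (tp + fp + fn) and F1 = 2tp / (2tp + fp + fn), hence J = F1 / (2 - F1). A quick numeric check (Python sketch with made-up counts):

```python
# Illustrative confusion-matrix counts (invented for this check)
tp, fp, fn = 30, 10, 5

jaccard = tp / (tp + fp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)

# Jaccard recovered from F1 via J = F1 / (2 - F1)
assert abs(jaccard - f1 / (2 - f1)) < 1e-12
print(round(jaccard, 4))
```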
Also, I added log loss, but it's a little odd given that the values are thresholded.