Consider making predict_prob give class probabilities like sklearn

chriswbartley commented 4 years ago

Currently predict_prob gives [P(y>1), P(y>2), P(y>3)] cumulative probabilities. Consider calculating class probabilities instead, and making the cumulative probabilities available as predict_cum_proba().

chriswbartley commented 3 years ago

Very close Fabricio! You are right that this is very confusing and inconsistent:

For binary, the predict_proba probabilities are the same as sklearn: [P(y=-1), P(y=+1)].
For multiclass, e.g. for y in {1,2,3,4}, the returned probabilities are [P(y>1), P(y>2), P(y>3)].

So yes you can calculate the individual class probabilities similarly to the way you mentioned: [1-P(y>1), P(y>1) - P(y>2),...]

However, I agree I should make this the same as sklearn; it's confusing. It turned out this way as an artifact of the way the binary ensembling is done. Note that when you calculate individual class probabilities you may get some small negative numbers, because the binary ensemble components are independent and so the cumulative probabilities are not guaranteed to be non-increasing across classes.
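For future readers, here is a minimal NumPy sketch of that conversion and of the negative-value caveat. It is only an illustration: `cum_to_class_proba` and `clip_and_renormalise` are hypothetical helper names, not part of monoensemble, the input numbers are made up, and clipping then renormalising is just one possible way to handle the small negatives, not necessarily what the library does.

```python
import numpy as np

def cum_to_class_proba(cum):
    """Turn rows of [P(y>1), ..., P(y>m-1)] into rows of [P(y=1), ..., P(y=m)]."""
    cum = np.asarray(cum, dtype=float)
    n = cum.shape[0]
    # Pad with P(y>0) = 1 on the left and P(y>m) = 0 on the right.
    padded = np.hstack([np.ones((n, 1)), cum, np.zeros((n, 1))])
    # P(y=k) = P(y>k-1) - P(y>k)
    return padded[:, :-1] - padded[:, 1:]

def clip_and_renormalise(proba):
    """Assumed repair step: zero out negatives, then rescale rows to sum to 1."""
    clipped = np.clip(proba, 0.0, None)
    return clipped / clipped.sum(axis=1, keepdims=True)

# Made-up cumulative outputs for y in {1,2,3,4}.
# The second row is not monotone (0.5 < 0.55), as can happen when the
# binary components are fitted independently.
cum = np.array([[0.9, 0.6, 0.2],
                [0.5, 0.55, 0.1]])

proba = cum_to_class_proba(cum)      # second row contains a small negative
fixed = clip_and_renormalise(proba)  # non-negative, rows sum to 1
```

Each row of `proba` sums to 1 by construction, so any negative entry is offset by slightly inflated neighbours; the repair step trades that exactness for valid probabilities.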

I'll have a go at changing predict_proba() to the sklearn standard now (via the above calculation), though I may run out of time today :)

Cheers, Chris

On Tue, 10 Aug 2021 at 11:45, Fabricio Vasselai @.***> wrote:

Hi there, I got very interested in the algorithms from your 2019 AAAI paper (thanks for merging my Pull Request, btw) precisely because I am trying to find ways to estimate prediction intervals of ordinal classifications made with monotonic constraints.

So, perhaps I can help with a new Pull Request for this issue. However, I got confused by your notation above. Am I understanding correctly that those are cumulative ordinal class probabilities? That is, suppose the classification problem has n data points and m ordinal classes (ordered ascendingly from 1 to m), where y is the output variable with n observations indexed by i, such that y_i \in {1...m}. Then, for a given test point i, wouldn't it be the case that the predict_prob() command in 'monoensemble' actually returns a vector [Pr(y_i >= 1), Pr(y_i >= 2), ..., Pr(y_i = m)] instead of the one you described?

If yes, then the solution to this issue would be trivial: one could retrieve the actual ordinal class probabilities by simply subtracting backwards from the cumulation, e.g. class Pr(y_i = m-1) = Pr(y_i >= m-1) - Pr(y_i = m). Am I getting right what predict_prob() is returning?


vasselai commented 3 years ago

Hi Chris, first of all, many thanks for the details. And apologies: I had deleted my original post here before you replied (I think you replied by email, which is why you did not notice it). I deleted it because I thought my original question was actually increasing confusion unnecessarily, and I preferred to wait a bit until I understood what was going on before re-posting.

It turns out I figured out how to do it properly for the multi-class case. In case it helps, here's the temporary snippet I was working with:

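import numpy as np

def cumToClassProb(cumPreds):
    # cumPreds columns: [P(y>1), ..., P(y>m-1)]; returns [P(y=1), ..., P(y=m)]
    preds = 1 - cumPreds.copy()  # column 0 becomes P(y=1) = 1 - P(y>1)
    for col in range(0, preds.shape[1] - 1):
        # middle classes: P(y=k) = P(y>k-1) - P(y>k)
        preds[:, col + 1] = cumPreds[:, col] - cumPreds[:, col + 1]
    # last class: P(y=m) = P(y>m-1)
    preds = np.column_stack((preds, cumPreds[:, -1]))
    return preds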
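
A quick sanity check of that snippet, for anyone who wants to try it. The function is repeated so the block runs on its own, and the input values are made up for illustration rather than taken from monoensemble output.

```python
import numpy as np

def cumToClassProb(cumPreds):
    # Same logic as the snippet in this comment, repeated so this block is runnable.
    preds = 1 - cumPreds.copy()
    for col in range(0, preds.shape[1] - 1):
        preds[:, col + 1] = cumPreds[:, col] - cumPreds[:, col + 1]
    return np.column_stack((preds, cumPreds[:, -1]))

# Made-up cumulative probabilities [P(y>1), P(y>2), P(y>3)] for y in {1,2,3,4}.
cumPreds = np.array([[0.8, 0.5, 0.1],
                     [0.3, 0.2, 0.05]])

classPreds = cumToClassProb(cumPreds)
rowSums = classPreds.sum(axis=1)

print(classPreds)  # one column per class, i.e. one more column than the input
print(rowSums)     # each row should sum to 1 (up to floating-point error)
```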

Also, now that you have settled the issue, I will quote below the relevant part of my original post just so future readers can understand what you reacted to:

Perhaps I can help with a new Pull Request for this issue. However, I got confused by your notation above. Am I understanding correctly that those are cumulative ordinal class probabilities? That is, suppose the classification problem has n data points and m ordinal classes (ordered ascendingly from 1 to m), where y is the output variable with n observations indexed by i, such that y_i \in {1...m}. Then, for a given test point i, wouldn't it be the case that the predict_prob() command in 'monoensemble' actually returns a vector [Pr(y_i >= 1), Pr(y_i >= 2), ..., Pr(y_i = m)] instead of the one you described?

If yes, then the solution to this issue would be trivial: one could retrieve the actual ordinal class probabilities by simply subtracting backwards from the cumulation, e.g. class Pr(y_i = m-1) = Pr(y_i >= m-1) - Pr(y_i = m). Am I getting right what predict_prob() is returning?

Looking forward to the new commits with your final solution! Cheers.

chriswbartley commented 3 years ago

OK, I've just pushed a version that should make predict_proba() give predicted class probabilities. It looks to work well, but admittedly I was a bit rushed, so let me know if you find problems!

chriswbartley commented 3 years ago

Oh, and thanks for the code snippet, that looks right! 🙂

vasselai commented 3 years ago

Finished doing a few days of testing. All looks good on this front - I did find one error, which I will open another issue for. But I think you can close this one here.

chriswbartley commented 3 years ago

Great to hear - thanks for testing Fabricio!

chriswbartley / monoensemble

Consider making predict_prob give class probabilities like sklearn #3